Ange gene in atopy

ABSTRACT

The present invention relates to isolated nucleic acid sequences of ANGE, CLLD8 and CLLD7 or sequences complementary or substantially homologous thereto or fragments thereof. Also provided are sequences comprising hybrid nucleic acid sequences from two or more of the genes. Also provided are nucleic acid expression vectors, polypeptides, antibodies to the polypeptides, host cells, non-human transgenic animals and pharmaceutical compositions and agents. Also provided is the use of the nucleic acid sequence and/or protein in medicine and research, methods for diagnosing or determining predisposition to disease or severity of disease, methods for preventing or treating disease, and kits for use in the methods and the use of the nucleic acid sequence and protein in treating or preventing IgE mediated diseases and non-atopic asthma, and in screens for identifying new agents for use in the methods.

Atopic or Immunoglobulin E (IgE) mediated diseases include, but are not limited to, asthma, hayfever, eczema, atopic dermatitis and allergic rhinitis. These disorders are a major cause of disease in children and young adults (Jarvis, D. & Burney, P. British Medical Journal 316, 607-10 (1998) [published erratum appears in BMJ 1998 Apr. 4; 316(7137):1078]; and Cookson, W. Nature 402, B5-11 (1999)).

Atopy and asthma are due to the interaction between strong environmental and genetic factors (Cookson, W. Nature 402, B5-11 (1999)). Asthma is usually recognised epidemiologically by standard symptom questionnaires or by physician diagnosis (O'Connor, G. T. & Weiss, S. T. Am J Respir Crit. Care Med 149, S21-8; discussion S29-30 (1994)). Atopy is detected by skin prick tests, or by measurement of specific serum IgE titres against allergens with RAST or ELISA techniques, or by quantifying the total serum IgE. The examination of quantitative traits offers significant advantages for both linkage and association analyses in general (Risch, N. J. & Zhang, H. Am J Hum Genet 58, 836-43 (1996)) and in the case of asthma (Cookson, W. & Palmer, L. Clin Exp Allergy 28 Suppl 1, 88-9; discussion 108-10 (1998)). A number of quantitative traits underlie asthma and atopy, including the total serum IgE concentration, the Skin Test Index (STI), the RAST index and the Dose-Response Slope (DRS) of bronchial responsiveness to methacholine (Daniels, S. E. et al. Nature 383, 247-50 (1996)). The total serum IgE is log-normally distributed, and has a high heritability (Gerrard, J., Rao, D. & Morton, N. Am J Hum Genet. 30, p 46-58 (1978) and Palmer, L. J. et al. Am J Respir Crit. Care Med 161, 1836-43 (2000)). It is influenced by genetic effects, which incompletely overlap those influencing the DRS and the STI.

The heritability of physician-diagnosed asthma is 60-70%⁷ and that of the (log normal) total serum IgE concentration is 40-50%^(8,9). The heritability of the STI is lower and is approximately 30%⁹. The examination of quantitative rather than categorical traits offers significant advantages of power for both linkage and association analyses in general and in the case of asthma¹¹. The total serum IgE is log-normally distributed with standardised measurement protocols, and the effects of age and sex are well defined¹². Consequently, we have used the total serum IgE as a quantitative trait to map susceptibility genes for atopy and asthma.

Differing indices of atopy may be elevated in the same family (Cookson, W. O. C. M. & Hopkin, J. M. Lancet 1, 86-88 (1988) and Young, R. P., Lynch, J., Sharp, P. A. et al. Journal of Medical Genetics 29, 236-238 (1992)). RAST and skin test responses reach a peak later in childhood than the total serum IgE, and decline at a slower rate thereafter (Cline, M. G. & Burrows, B. B. Thorax 44, 425-431 (1989)). To account for this heterogeneity of phenotype (pleiotropy), the categorical trait of “atopy” is based on a combination of the STI, RAST index, and the total serum IgE (Daniels, S. E. et al. Nature 383, 247-50 (1996) and Cookson, W. O. C. M. & Hopkin, J. M. Lancet 1, 86-88 (1988)).

Atopy is due to the interaction between genetic and environmental factors. The genetic factors are thought to be variants of DNA structure (“polymorphisms”) that alter the level of expression or the function of genes to predispose to asthma. Variants of DNA sequence at a particular site (“locus”) are known as “alleles”. Genome-wide scans for linkage to atopy and asthma-associated phenotypes have been conducted (Daniels, S. E. et al. Nature 383, 247-50 (1996)). Strong linkage of the atopy phenotype to chromosome 13q14 was observed, and confirmed in a second panel of families at the time of our initial genome screen. An earlier study had found linkage of the total serum IgE to the esterase D (ESD) protein polymorphism on chromosome 13q14 (Eiberg, H., et al. Cytogenetics and Cell Genetics 40, 622 (1985)). Linkage to the region has also been confirmed by a single locus study of Japanese families (Kimura, K. et al. Hum Mol Genet. 8, 1487-90 (1999)). A two-stage screen in Hutterite families from the US found linkage of asthma to 13q21.3 (Ober, C. et al. The Collaborative Study on the Genetics of Asthma. Hum Mol Genet. 7, 1393-8 (1998)) in the first stage families but not in the second. Linkage to 13q14 has also been observed for house dust mite allergy in children with asthma (Hizawa, N. et al. Collaborative Study on the Genetics of Asthma (CSGA). J Allergy Clin Immunol 102, p 436-42 (1998)), and in children with atopic dermatitis (Beyer K, W. U et al J Allergy Clin Immunol 101, 152 (1998)). These results suggest that chromosome 13 contains an important atopy locus. A locus for atopic dermatitis has recently been mapped to the same region of chromosome 13, 13q14 (Bradley M, Soderhall C, Luthman H, Wahlgren CF, Kockum I, Nordenskjold M. Susceptibility loci for atopic dermatitis on chromosomes 3, 13, 15, 17 and 18 in a Swedish population. Hum Mol Genet. 2002 Jun. 15; 11(13):1539-48).

Close localisation of disease causing genes may be accomplished by the detection of associations between particular alleles and the disease phenotype. Over short segments of DNA, distinctive alleles of the individual polymorphisms will show non-random association with alleles of neighbouring polymorphisms. This phenomenon, known as “linkage disequilibrium”, typically occurs over 50-500 Kilobases (Kb) of DNA (Jorde, L. B. et al. Am J Hum Genet. 54, 884-98 (1994); Collins, A., Lonjou, C. & Morton, N. E. Proc Natl Acad Sci U S A 96, 15173-7 (1999) and Abecasis, G. R. et al. Am J Hum Genet 68, 191-197 (2001)), and associations between polymorphism and disease are in general unlikely to extend beyond 500 Kb. Linkage disequilibrium may be detected by the study of individuals and by the study of families.
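The non-random association described above is commonly quantified with the coefficients D, D′ and r². The following Python sketch is illustrative only and is not taken from the present application; the function name and the haplotype frequencies passed to it are hypothetical, and two biallelic loci are assumed.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a  -- frequency of allele A at locus 1
    p_b  -- frequency of allele B at locus 2
    (All inputs are hypothetical illustration values, not data from the application.)
    """
    # D is the departure of the observed haplotype frequency from independence.
    d = p_ab - p_a * p_b
    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the alleles at the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2
```

Values of D′ and r² near 1 indicate strong association between the alleles, whereas values near 0 indicate that the alleles are inherited approximately independently.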

Disease causing alleles will be in linkage disequilibrium with non-functional polymorphisms from the same chromosomal segment. It is therefore possible to detect allelic association with disease from particular chromosomal segments, without identifying the exact polymorphism and gene underlying the disease state.

The detection of allelic association may therefore give information as to disease susceptibility in a particular individual. Furthermore, allelic association is indicative of a disease-causing gene being present within a limited distance of DNA in either direction from the allele.

Identification of the disease causing gene will allow the identification of children at risk of atopy before the disease has developed (for example immediately after birth), with the potential for prevention of disease. Knowledge of the gene and its activity will enable predictions to be made regarding the type of disease (i.e. asthma, dermatitis or allergies) and the clinical course of disease (e.g. severe as opposed to mild) or the response to particular treatments. This diagnostic information will be of use to the health care, pharmaceutical and insurance industries.

According to a first aspect of the invention there is provided an isolated or recombinant nucleic acid sequence comprising a sequence as shown in FIG. 5, or a sequence which excludes one or more of the exons as set out in FIG. 3 a or a sequence complementary or substantially homologous thereto, or a fragment thereof. The sequence of FIG. 5 comprises the human ANGE, CLLD7 and CLLD8 nucleotide sequences. FIG. 5 a (i) shows the exon sequences of the ANGE gene including the 2 alternative first exons; FIG. 5 a (ii) shows the ANGE mRNA sequence and FIG. 5 a (iii) the translated protein sequence. The NY-REN-34 mRNA sequence is shown in FIG. 5 b (i) and the protein sequence of NY-REN-34 in FIG. 5 b (ii); an alternative NY-REN-34 protein sequence is shown in FIG. 5 b (iii).

For the purposes of the present invention, the ANGE gene is the gene known in the prior art as NY-REN-34, and shown as nucleotides 313649-346509 of BAC bA103J18.03548 (FIG. 5). References to the ANGE gene in the present application include variant sequences showing 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 95%, 96%, 97%, 98%, 99% or 100% homology with the ANGE gene of FIG. 5, and preferably sharing one or more functional characteristics with the ANGE gene.

The ANGE nucleic acid sequence can comprise any combination of one or more exons from FIGS. 3 a, 5 a (i) and Table 5 or a sequence substantially homologous thereto, or a fragment thereof. These combinations are for all ANGE sequences including human and mouse.

All the sequences of the present invention are isolated, or alternatively may be recombinant. By isolated is meant a nucleic acid or polypeptide sequence which has been purified, and is substantially free of other protein and nucleic acid. Such sequences may be obtained by PCR amplification, cloning techniques, or synthesis on a synthesiser. By recombinant is meant nucleic acid sequences which have been recombined by the hand of man.

The polynucleotide sequences of the invention may be genomic or cDNA, or RNA, preferably mRNA, or PNA or other nucleic acid analogue known to the person skilled in the art. In the present invention, gene products include polynucleotide sequences and protein. References to polypeptide sequences include proteins and peptides.

The public domain REFSEQ entries for the mRNA sequences of CLLD7, CLLD8 and NY-REN-34 are NM_018191.2, NM_031915.1 and NM_016119.1 respectively. These show minor differences at the nucleotide level to the sequences shown above. However, for NY-REN-34 these alterations result in a truncated putative protein compared to our sequence which is shown below.

In the present application, sequences which are complementary or substantially homologous are those sequences which hybridise under stringent conditions to the defined sequence or its gene products. Thus, for example, a nucleic acid sequence substantially homologous to a reference nucleic acid will be capable of hybridising to a gene product (i.e. mRNA) of the reference nucleic acid, under stringent conditions. A complementary sequence is one which is capable of hybridising to the nucleic acid sequence itself, under stringent conditions. Also provided in the present invention are complements of the substantially homologous sequences. A substantially homologous sequence preferably has at least 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98 or 100% sequence identity with the defined sequence. This definition of substantially homologous applies to both nucleic acid and polypeptide sequences. Thus, polypeptide sequences having conservative amino acid substitutions that do not affect structure or function are also included. For any given DNA sequence, references to a complementary sequence include the corresponding mRNA sequence and any cDNA sequence derived from such an RNA sequence.

“% identity” is a measure of the relationship between two nucleic acid or polypeptide sequences, as determined by comparing their sequences. In general, the two sequences to be compared are aligned to give a maximum correlation between the sequences. The alignment of the two sequences is examined and the number of positions giving an exact amino acid or nucleotide correspondence is determined, and divided by the total length of the alignment, and the result is multiplied by 100 to give a % identity. The % identity may be determined over the whole length of the sequence to be compared, which is particularly suitable for sequences of the same or similar lengths or for sequences which are highly homologous, or over shorter defined lengths which is more suitable for sequences of unequal lengths and with a lower homology.
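The calculation described above may be sketched in Python as follows. This is a minimal illustration only: the function name and the use of “-” as the gap symbol are assumptions, and the two input strings are taken to be already aligned (for example by a program such as those discussed below).

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """% identity of two pre-aligned sequences of equal length.

    Exact matches are counted column by column, divided by the total
    length of the alignment, and multiplied by 100. The gap symbol is
    an assumption of this sketch; a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)
```

For instance, the hypothetical six-column alignment of `ACGT-A` with `ACCTGA` has four exact matches, so the function returns 4/6 × 100, i.e. approximately 66.7.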

Methods for comparing the identity of two or more sequences are known in the art. For example, programs available in the Wisconsin Sequence Analysis Package version 9.1 (Devereux J et al., Nucl Acid Res 12 387-395 (1984), available from Genetics Computer Group, Madison, Wis., USA), such as BESTFIT and GAP may be used.

BESTFIT uses the “local homology” algorithm of Smith and Waterman (Advances in Applied Mathematics, 2:482-489, 1981) and finds the best single region of similarity between two sequences. BESTFIT is more suited to comparing two polynucleotide or two polypeptide sequences which are dissimilar in length, the program assuming that the shorter sequence represents a portion of the longer. In comparison, GAP aligns two sequences finding a “maximum similarity” according to the algorithm of Needleman and Wunsch (J. Mol. Biol. 48:443-453, 1970). GAP is more suited to comparing sequences which are approximately the same length and for which an alignment is expected over the entire length. Preferably, the parameters “Gap Weight” and “Length Weight” used in each program are 50 and 3 for polynucleotide sequences and 12 and 4 for polypeptide sequences, respectively. Preferably, % identities and similarities are determined when the two sequences being compared are optimally aligned.
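The difference between the global (Needleman and Wunsch) and local (Smith and Waterman) approaches may be illustrated by the following Python sketch. It is not the BESTFIT or GAP implementation: the function name is hypothetical, and the match, mismatch and linear gap scores are simplifying assumptions that do not correspond to the “Gap Weight” and “Length Weight” parameters given above.

```python
def align_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Best alignment score of sequences a and b by dynamic programming.

    local=False -- global alignment over the entire length of both sequences
                   (Needleman-Wunsch style)
    local=True  -- best single region of similarity (Smith-Waterman style)
    The scoring values are illustrative assumptions (linear gap penalty).
    """
    cols = len(b) + 1
    # First row: leading gaps cost nothing in local mode, j * gap in global mode.
    prev = [0 if local else j * gap for j in range(cols)]
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0 if local else i * gap] + [0] * (cols - 1)
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(diag, prev[j] + gap, cur[j - 1] + gap)
            if local:
                # A local alignment may restart anywhere, so scores never drop below 0.
                score = max(score, 0)
                best = max(best, score)
            cur[j] = score
        prev = cur
    return best if local else prev[-1]
```

With a short sequence embedded in a longer one, the local score rewards only the matching region, whereas the global score is penalised for every unaligned position; this reflects why a local method suits sequences which are dissimilar in length.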

Other programs for determining identity and/or similarity between sequences are also known in the art, for instance the BLAST family of programs (Altschul et al, J. Mol. Biol., 215:403-410, (1990) and Altschul et al, Nuc Acids Res., 25:3389-3402 (1997), available from the National Center for Biotechnology Information (NCBI), Bethesda, Md., USA and accessible through the home page of the NCBI at www.ncbi.nlm.nih.gov) and FASTA (Pearson W. R. and Lipman D. J., Proc. Nat. Acad. Sci., USA, 85:2444-2448 (1988), available as part of the Wisconsin Sequence Analysis Package). Preferably, the BLOSUM62 amino acid substitution matrix (Henikoff S, and Henikoff J. G., Proc. Nat. Acad. Sci., USA, 89:10915-10919, (1992)) is used in polypeptide sequence comparisons including where nucleotide sequences are first translated into amino acid sequences before comparison.

Preferably, the program BESTFIT is used to determine the % identity of a query polynucleotide or a polypeptide sequence with respect to a polynucleotide or a polypeptide sequence of the present invention, the query and the reference sequence being optimally aligned and the parameters of the program set at the default value.

In relation to the present invention, “stringent conditions” refers to the washing conditions used in a hybridisation protocol. In general, the washing conditions should be a combination of temperature and salt concentration so that the denaturation temperature is approximately 5 to 20° C. below the calculated T_(m) of the nucleic acid under study. The T_(m) of a nucleic acid probe of 20 bases or less is calculated under standard conditions (1M NaCl) as [4° C.×(G+C)+2° C.×(A+T)], according to Wallace rules for short oligonucleotides. For longer DNA fragments, the nearest neighbour method, which combines solid thermodynamics and experimental data may be used, according to the principles set out in Breslauer et al., PNAS 83: 3746-3750 (1986). The optimum salt and temperature conditions for hybridisation may be readily determined in preliminary experiments in which DNA samples immobilised on filters are hybridised to the probe of interest and then washed under conditions of different stringencies. While the conditions for PCR may differ from the standard conditions, the T_(m) may be used as a guide for the expected relative stability of the primers. For short primers of approximately 14 nucleotides, low annealing temperatures of around 44° C. to 50° C. are used. The temperature may be higher depending upon the base composition of the primer sequence used. Suitably stringent conditions are those under which non-specific hybridisation (e.g. to non-DPP10 encoding sequences) are avoided. Suitable stringent conditions are 0.5×SSC/1% SDS/58° C./30 mins for a 21mer oligonucleotide probe.
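The Wallace rule calculation for short probes, and the washing temperature range derived from it, may be sketched in Python as follows. The function names and the example probe are hypothetical; only the formula and the 5 to 20° C. range are taken from the paragraph above.

```python
def wallace_tm(probe):
    """Tm in degrees C by the Wallace rule: 4 x (G+C) + 2 x (A+T).

    Intended for probes of 20 bases or less under standard conditions
    (1M NaCl), as stated in the text.
    """
    seq = probe.upper()
    if any(base not in "ACGT" for base in seq):
        raise ValueError("probe must contain only A, C, G and T")
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 4 * gc + 2 * at


def wash_window(tm):
    """Temperature range lying 20 to 5 degrees C below the calculated Tm."""
    return tm - 20, tm - 5


# Hypothetical 20-base probe with 10 G/C and 10 A/T residues.
example_tm = wallace_tm("ATGC" * 5)   # 4*10 + 2*10 = 60
example_window = wash_window(example_tm)
```

For the hypothetical 20-mer above the calculated Tm is 60° C., giving a range of 40 to 55° C.; as noted in the text, the optimum conditions would still be confirmed in preliminary experiments.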

The complementary sequences of the invention (which may also be referred to herein as “antisense”) may be useful as probes or primers, or in the regulation of ANGE expression. Preferably, the primer sequences are capable of amplifying all or a portion of an ANGE gene. Preferred primer sequences are disclosed in the Examples and Table 4. Pairs of primers for amplification of all or part of the gene, or alleles, or variants thereof, form another aspect of the invention. Similarly, ANGE probes will be useful in detecting the presence or expression levels of ANGE, or variant forms thereof, in a sample from a subject. The probes may also be useful in analysing the expression pattern of ANGE in a subject.

In the present application, fragments are any contiguous 10 residue sequence, or greater, such as 20, 30, 40, or 50 residue sequence. Preferably, fragments of nucleic acid or polypeptide sequences share one or more functional characteristics with ANGE or its gene, or are capable of modulating (i.e. inhibiting or enhancing) such a functional characteristic. The novelty of a fragment according to the present embodiment may be easily ascertained by comparing the nucleotide or polypeptide sequence of the fragment with sequences catalogued in databases such as GenBank at the priority date, or by using computer programs such as DNASIS (Hitachi Engineering Inc) or Word Search or FASTA of the Genetics Computer Group (Madison, USA).

The fragments may be used in a variety of diagnostic, prognostic or therapeutic methods or may be useful as research tools for example in screening. Fragments of the sequences of the first aspect or their complements may be used as primer sequences as described above.

In a second aspect of the invention there is provided an isolated or recombinant nucleic acid sequence comprising a sequence as shown in FIG. 5 or a sequence as shown in FIG. 5 which excludes one or more of the exon sequences as set out in FIG. 5 d (i) and Table 5 or a sequence complementary or substantially homologous thereto or a fragment thereof. The CLLD8 mRNA sequence is shown in FIG. 5 d (ii) and the CLLD8 protein sequence in FIG. 5 d (iii). Nucleotides 294727 to 309803 of FIG. 5 are the human CLLD8 nucleic acid sequence.

In a third aspect of the invention there is provided an isolated or recombinant nucleic acid sequence comprising a sequence as shown in FIG. 5 or a sequence as shown in FIG. 5 which excludes one or more of the exon sequences as set out in FIG. 5 c (i) and Table. The CLLD7 mRNA sequence is shown in FIG. 5 c (ii) and the CLLD7 protein sequence in FIG. 5 c (iii). Nucleotides 349634 to 410846 of FIG. 5 are the human CLLD7 nucleic acid sequence.
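The nucleotide ranges given for the three genes may be handled as in the following Python sketch. The coordinates are those stated in the present application; the helper names are hypothetical, and the coordinates are assumed to be 1-based and inclusive, which the text does not state explicitly.

```python
# Nucleotide ranges within the sequence of FIG. 5, as given in the text.
# ASSUMPTION: coordinates are 1-based and inclusive at both ends.
GENE_COORDS = {
    "CLLD8": (294727, 309803),
    "ANGE": (313649, 346509),
    "CLLD7": (349634, 410846),
}


def span_length(start, end):
    """Number of nucleotides in a 1-based inclusive range."""
    return end - start + 1


def extract(sequence, start, end):
    """Slice a 1-based inclusive range out of a Python string."""
    return sequence[start - 1:end]


def intergenic_gap(upstream, downstream):
    """Nucleotides lying strictly between two of the genes above."""
    return GENE_COORDS[downstream][0] - GENE_COORDS[upstream][1] - 1
```

Under this assumption the ranges span 15077 (CLLD8), 32861 (ANGE) and 61213 (CLLD7) nucleotides, with 3845 nucleotides between CLLD8 and ANGE and 3124 between ANGE and CLLD7, consistent with the three genes lying in the order CLLD8, ANGE, CLLD7 along the sequence.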

In a fourth aspect of the invention there is provided an isolated or recombinant nucleic acid sequence as claimed in claim 1 comprising a sequence as shown in FIG. 5 (nucleotides 313649-346509) (ANGE) contiguous with an isolated or recombinant nucleic acid sequence as claimed in claim 2 comprising a sequence as shown in FIG. 5 (nucleotides 294727-309803) (CLLD8) or a sequence complementary or substantially homologous thereto or a fragment thereof.

Alternatively, there is provided an isolated or recombinant polynucleotide sequence comprising the CLLD8 gene and ANGE gene, wherein both genes are under the control of a single regulatory element. Preferably, the regulatory element is a promoter. More preferably, the regulatory element is the CLLD8 promoter.

Alternatively, there is provided an isolated or recombinant polynucleotide sequence encoding a protein having the domain structure Pre-SET-SET-Post-SET-PHD-PHD. Preferably, the domain structure is CpGBD-Pre-SET-SET-Post-SET-PHD-PHD. SET and PHD domains will be known to persons skilled in the art.

Preferably, the polynucleotide sequences of the fourth aspect encode a single gene product comprising both CLLD8 and ANGE. This composite gene product is produced as a splice product of the CLLD8 gene, under control of the CLLD8 promoter, and comprises the ANGE gene product. This composite CLLD8-ANGE gene product results from splicing together of the CLLD8 and ANGE genes and is shown to be involved in atopy.

In a fifth aspect of the invention there is provided an isolated or recombinant nucleic acid sequence as claimed in claim 1 comprising a sequence (nucleotides 313649-346509) as shown in FIG. 5 (ANGE) contiguous with an isolated or recombinant nucleic acid sequence as claimed in claim 3 comprising a sequence (nucleotides 349634 to 410846) as shown in FIG. 5 (CLLD7) or a sequence complementary or substantially homologous thereto or a fragment thereof.

Alternatively, there is provided an isolated or recombinant polynucleotide sequence comprising the ANGE gene and CLLD7 gene, wherein both genes are under the control of a single regulatory element. Preferably, the regulatory element is a promoter. More preferably, the regulatory element is the ANGE promoter.

Preferably, the polynucleotide sequences of the fifth aspect encode a single gene product comprising both ANGE and CLLD7. This composite gene product is produced as a splice product of the ANGE gene, under control of the ANGE promoter, and comprises the CLLD7 gene product. This composite CLLD7-ANGE gene product results from splicing together of the CLLD7 and ANGE genes and is shown to be involved in atopy.

In a sixth aspect of the invention there is provided an isolated or recombinant nucleic acid sequence comprising a sequence as claimed in claim 2 comprising a sequence (nucleotides 294727-309803) as shown in FIG. 5 (CLLD8) contiguous with an isolated or recombinant nucleic acid sequence as claimed in claim 3 comprising a sequence (nucleotides 349634 to 410846) as shown in FIG. 5 (CLLD7) or a sequence complementary or substantially homologous thereto or a fragment thereof.

Alternatively, there is provided an isolated or recombinant polynucleotide sequence comprising the CLLD8 gene and CLLD7 gene, wherein both genes are under the control of a single regulatory element. Preferably, the regulatory element is a promoter. More preferably, the regulatory element is the CLLD7 promoter.

Preferably, the polynucleotide sequences of the sixth aspect encode a single gene product comprising both CLLD8 and CLLD7. This composite gene product is produced as a splice product of the CLLD8 gene, under control of the CLLD8 promoter, and comprises the CLLD7 gene product. This composite CLLD8-CLLD7 gene product results from splicing together of the CLLD8 and CLLD7 genes and is shown to be involved in atopy.

In a seventh aspect of the invention there is provided an isolated or recombinant nucleic acid sequence comprising a sequence as claimed in claim 1 comprising a sequence (nucleotides 313649-346509) as shown in FIG. 5 (ANGE) contiguous with an isolated or recombinant nucleic acid sequence as claimed in claim 2 comprising a sequence (nucleotides 294727-309803) as shown in FIG. 5 (CLLD8) and contiguous with an isolated or recombinant nucleic acid sequence as claimed in claim 3 comprising a sequence (nucleotides 349634 to 410846) as shown in FIG. 5 (CLLD7) or a sequence complementary or substantially homologous thereto or a fragment thereof.

Alternatively, there is provided an isolated or recombinant polynucleotide sequence comprising the ANGE gene, the CLLD8 gene and the CLLD7 gene, wherein all the genes are under the control of a single regulatory element. Preferably, the regulatory element is a promoter. More preferably, the regulatory element is the CLLD8 promoter.

Preferably, the polynucleotide sequences of the seventh aspect encode a single gene product comprising CLLD8, ANGE and CLLD7. This composite gene product is produced as a splice product of the CLLD8 gene, under control of the CLLD8 promoter, and comprises the ANGE and CLLD7 gene products. This composite CLLD8-ANGE-CLLD7 gene product results from splicing together of the CLLD8, ANGE and CLLD7 genes and is shown to be involved in atopy.

By the terms “ANGE”, “CLLD7” and “CLLD8” are meant either the complete gene product, or a part or parts thereof. Parts of the gene products are preferably splice variants, and preferably include at least one exon or a transcript produced from at least one exon. Thus, the gene products of the fourth to seventh aspects include at least one exon or part of an exon of CLLD7, CLLD8 and ANGE.

In an eighth aspect of the invention there is provided an isolated or recombinant nucleic acid sequence comprising at least a part of the sequence of FIG. 5, and comprising one or more SNPs at positions which correspond to the positions of FIG. 5 listed in Table 1.

Particular isolated nucleic acid molecules include those:

as shown in Table 2c; comprising a SNP at the position corresponding to position 185752b4_2 of FIG. 5; comprising a SNP at the position corresponding to position 185752b5_3 of FIG. 5; comprising a SNP at the position corresponding to position 432101b38_1 of FIG. 5.

The isolated or recombinant nucleic acid molecules of the eighth aspect of the present invention are different to the “wild type” or “reference” sequence of FIG. 5.

This aspect of the invention also provides antisense sequences. Such sequences are typically single stranded and are capable of hybridising to the above mentioned nucleic acid sequences of the invention, or to the sequence of FIG. 5, under stringent conditions. Preferred antisense sequences are those which are capable of hybridising to an allele of a polymorphism of the invention, and most preferably are capable of distinguishing between alleles of a polymorphism (of Table 1). Stringent conditions are defined below. The antisense sequences may be prepared synthetically or by nick translation, and are preferably isolated or recombinant.

The antisense sequences include primers and probes, for example, for use in the methods of the present invention. Primer sequences are capable of acting as an initiation site for template directed nucleic acid synthesis, under appropriate conditions, which will be known to skilled persons. Probes are useful in the detection, identification and isolation of particular nucleic acid sequences. Probes and primers are preferably 15 to 30 nucleotides in length.

For amplification purposes, pairs of primers are provided. These include a 5′ primer, which hybridises to the 5′ end of the nucleic acid sequence to be amplified, and a 3′ primer, which hybridises to the complementary strand of the 3′ end of the nucleic acid to be amplified. Preferred primers are those listed in Table 4.
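The relationship between a target sequence and a pair of flanking primers may be sketched in Python as follows. This is an illustration only and does not reproduce the primers of Table 4: the function names and the default length of 20 nucleotides (chosen from within the preferred range of 15 to 30) are assumptions, and the sketch follows the common convention that the forward primer matches the first bases of the given strand while the reverse primer is the reverse complement of its last bases.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, returned 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_pair(target, length=20):
    """Return (forward, reverse) primers flanking the target, both written 5' to 3'.

    The default length is an assumption of this sketch. No check of
    melting temperature or specificity is made here.
    """
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    t = target.upper()
    forward = t[:length]                       # matches the 5' end of the given strand
    reverse = reverse_complement(t[-length:])  # anneals to the 3' end of the given strand
    return forward, reverse
```

In practice candidate primers obtained in this way would still be checked for specificity and for a suitable melting temperature before use.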

Probes and primers may be labelled, for example to enable their detection. Suitable labels include, for example, a radiolabel, enzyme label, fluoro-label, and biotin-avidin label for subsequent visualisation in, for example, a Southern blot procedure. A labelled probe or primer may be reacted with a sample DNA or RNA, and the areas of the DNA or RNA which carry complementary sequences will hybridise to the probe, and become labelled themselves. The labelled areas may be visualised, for example by autoradiography.

Preferably, the probes and/or primers hybridise under “stringent conditions”, which refers to the washing conditions used in a hybridisation protocol. The hybridisation conditions for probes are preferably sufficiently stringent to allow distinction between different alleles of a polymorphism upon binding of the probes. In general, the washing conditions should be a combination of temperature and salt concentration so that the denaturation temperature is approximately 5 to 20° C. below the calculated T_(m) of the nucleic acid under study. The T_(m) of a nucleic acid probe of 20 bases or less is calculated under standard conditions (1M NaCl) as [4° C.×(G+C)+2° C.×(A+T)], according to Wallace rules for short oligonucleotides. For longer DNA fragments, the nearest neighbour method, which combines solid thermodynamics and experimental data, may be used, according to the principles set out in Breslauer et al., PNAS 83: 3746-3750 (1986). The optimum salt and temperature conditions for hybridisation may be readily determined in preliminary experiments in which DNA samples immobilised on filters are hybridised to the probe of interest and then washed under conditions of different stringencies. While the conditions for PCR may differ from the standard conditions, the T_(m) may be used as a guide for the expected relative stability of the primers. For short primers of approximately 14 nucleotides, low annealing temperatures of around 44° C. to 50° C. are used. The temperature may be higher depending upon the base composition of the primer sequence used. Typically, the salt concentration is no more than 1M, and the temperature is at least 25° C. Suitable conditions are 5×SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA pH 7.4) and a temperature of 25-30° C.

The use of one or more of the SNP markers of Table 1 in the identification of a gene or genetic element which influences IgE mediated disease or non-atopic asthma is also included. Also included is the use of one or more of the SNP markers in medicine or in the identification of an agent for use in the diagnosis, prevention or treatment of an IgE mediated disease or non-atopic asthma.

In a preferred embodiment the SNP marker is as shown in Table 2c, or 185752b4_2, 185752b5_3, 432103b43_1 or 432101b38_1 or any SNP in linkage disequilibrium with the SNP markers selected in Table 1. The polynucleotides of the first to eighth aspects are used in the diagnosis of individuals having an IgE mediated disease or non-atopic asthma, or in the treatment of individuals having such disease. The polynucleotides may also be used in the manufacture of a diagnostic for diagnosing individuals having an IgE mediated disease or non-atopic asthma or in the treatment of individuals having such diseases.

In a ninth aspect of the invention, the isolated nucleic acid sequences of the invention may be provided in the form of a vector to enable the in vitro or in vivo expression of the isolated nucleic acid sequences of any of the first to eighth aspects. Vectors include plasmids, chromosomes, artificial chromosomes and viruses and may be expression vectors, which are capable of expressing nucleic acid sequences in vitro or in vivo, or transformation vectors which are capable of transferring the nucleic acid sequence from one environment to another. The nucleic acid molecules of the invention may be operably linked to one or more regulatory elements including a promoter.

The term regulatory elements includes response elements, consensus sites, methylation sites, locus control regions, post-transcriptional modifications, splice variants, homeoboxes, inducible factors, DNA binding domains, enhancer sequences, initiation codons, secretion signals and polyA sequences. Regions upstream or downstream of a promoter such as enhancers, which regulate the activity of the promoter, are also regulatory elements.

The vector may also comprise an origin of replication; appropriate restriction sites to enable cloning of inserts adjacent to the polynucleotide molecule; markers, for example antibiotic resistance genes; ribosome binding sites; RNA splice sites and transcription termination regions; polymerisation sites; or any other element, such as secretion signals, which may facilitate the cloning and/or expression of the polynucleotide molecule.

Within a vector the gene may be expressed upstream or downstream of an expressed protein tag such as a histidine tag, V5 epitope tag, green fluorescent protein tag, MHC tag or other such tag known to those skilled in the art. Use of such a tag allows easy localisation, affinity purification and detection of the fusion protein with an antibody to the tag moiety.

Where two or more nucleic acid molecules of the invention are introduced into the same vector, each may be controlled by its own regulatory sequences, or all molecules may be controlled by the same regulatory sequence. In the same manner, each molecule may comprise a 3′ polyadenylation site. Examples of suitable vectors will be known to persons skilled in the art and include pBluescript II, lambdaZap, and pCMV-Script (Stratagene Cloning Systems, La Jolla, USA).

Appropriate regulatory elements, in particular promoters, will usually depend upon the host cell into which the expression vector is to be inserted. Where microbial host cells are used, promoters such as lactose promoter system, tryptophan (Trp) promoter system, β-lactamase promoter system or phage lambda promoter systems are suitable. Where yeast cells are used, preferred promoters include alcohol dehydrogenase I or glycolytic promoters. In mammalian host cells, preferred promoters are those derived from immunoglobulin genes, SV40, Adenovirus, Bovine Papilloma virus etc. Suitable promoters for use in various host cells would be readily apparent to a person skilled in the art (See, for example, Current Protocols in Molecular Biology Edited by Ausubel et al, published by Wiley). In addition, the regulatory elements may be modified, for example by the addition of further regulatory elements, to achieve a desired expression pattern.

By operably linked is meant that the components of the vector or sequence are in a relationship which allows them to function as intended.

These vectors may be used to transform host cells, for example, prokaryotic or eukaryotic cells. These cells may be used in the production of recombinant gene products produced from the isolated nucleic acid sequences of the first to eighth aspects, or in the regulation or analysis of the nucleic acid sequences of the first to eighth aspects. The transformed host cells form part of the invention. Preferred cells include E. coli, yeast, filamentous fungi, insect cells, mammalian cells, preferably immortalised, such as mouse, CHO, HeLa, Myeloma or Jurkat cell lines, human and monkey cell lines and derivatives thereof.

According to a tenth aspect of the invention, there is provided a polypeptide sequence comprising a polypeptide sequence encoded by a nucleic acid sequence of the first to eighth aspects of the invention. Preferably the polypeptide sequences are encoded by a nucleic acid sequence of FIG. 5.

The tenth aspect of the invention includes a polypeptide sequence comprising a polypeptide sequence as shown in any one of FIGS. 5 a (iii), 5 a (v), 5 b (ii), 5 b (iii), 5 c (iii), 5 d (iii) or a sequence homologous thereto, or a fragment thereof. The sequences of FIGS. 5 a (iii), 5 a (v), 5 b (ii), 5 b (iii), 5 c (iii), 5 d (iii) are the predicted human ANGE, CLLD7 and CLLD8 polypeptide sequences respectively.

The ANGE, CLLD7 and CLLD8 polypeptides or sequences substantially homologous thereto or fragments thereof may be subject to post-translational modification. Post-translational modification (PTM) is defined herein as including modification of a protein following translation by proteolytic cleavage, e.g. cleavage of a preprotein, a proprotein or a preproprotein by removal of a signal sequence, or activation of a zymogen. PTM also includes glycosylation, i.e. the attachment of a carbohydrate to a protein; the predominant sugars attached include glucose, galactose, mannose, fucose, GalNAc, GlcNAc and NANA. The carbohydrates may be linked to the protein by either O-glycosidic or N-glycosidic bonds. Also included are acylation, methylation, phosphorylation, sulfation and prenylation. Vitamin C-dependent modifications such as proline and lysine hydroxylation and carboxy terminal amidation, and vitamin K-dependent modifications such as carboxylation of glutamate residues, are also included, as is the addition of selenium as selenocysteine in a protein.

The ANGE, CLLD7 and CLLD8 polypeptides may be operably linked to a secretion signal, to assist their secretion from the Golgi apparatus to another part of the cell. Suitable secretion signals can be provided by recombinant vectors such as pSecTag2 (Invitrogen Corporation, Carlsbad, Calif.). Proteins expressed from such vectors are fused at the N-terminus to the murine Ig kappa chain leader sequence. The secretion signal may be linked to the soluble ANGE, CLLD7 and CLLD8 polypeptide sequences using techniques available in the art, including recombinant DNA technology. The polypeptides may be linked to a tag such as a histidine tag, V5 epitope tag, green fluorescent protein tag, MHC tag or other tag known to those skilled in the art or to a carrier molecule known to a person skilled in the art.

The polypeptide sequences of the tenth aspect are preferably functional and may be useful in drug screening, diagnosis or therapy. Functional fragments of ANGE, CLLD7 or CLLD8 are those which share immunological or functional characteristics with the full length, membrane bound or soluble form of ANGE, CLLD7 or CLLD8. Fragments may be at least 10, preferably 15, 20, 25, 30, 35, 40 or 50 amino acids in length. Preferably, the polypeptide sequences are isolated.

In an eleventh aspect of the present invention, there are provided antibodies which are specific for an antigen of a polypeptide sequence of the tenth aspect or an antigen of the isolated nucleic acid of the first to eighth aspects, or fragments of any of said aspects or which react with an antigen of a polypeptide sequence of the tenth aspect or the isolated nucleic acid of the first to eighth aspects, or fragments of any of said aspects. Herein the term “react” has the meaning that the antibody is able to interact with the polypeptide or isolated nucleic acid. The term “specific for” has the meaning that the antibody specifically reacts with the polypeptide or isolated nucleic acid.

Antibodies can be made by standard procedures (Harlow and Lane, “Antibodies: A Laboratory Manual”, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1998). Briefly, purified antigen can be injected into an animal in an amount and at intervals sufficient to elicit an immune response. Antibodies can either be purified directly, or spleen cells can be obtained from the animal. The cells are then fused with an immortal cell line and screened for antibody secretion. The antibodies can be used to screen DNA clone libraries for cells secreting the antigen. Those positive clones can then be sequenced as described in, for example, Kelly et al., Bio/Technology 10:163-167 (1992) and Bebbington et al., Bio/Technology 10:169-175 (1992). Preferably, the antigen being detected and/or used to generate a particular antibody will include polypeptide sequences according to the tenth aspect or isolated nucleic acid sequences according to the first to eighth aspects. The antibody may be a polyclonal or monoclonal antibody, a chimeric antibody, a humanised antibody or a bifunctional antibody or a fragment of any of the above. A bifunctional antibody is an antibody that can bind to two different antigens. These antigens may be different antigens present in the ANGE, CLLD7 or CLLD8 polypeptides or isolated nucleic acid, or may be an antigen of ANGE, CLLD7 or CLLD8 combined with e.g. a cellular antigen.

In a preferred embodiment polyclonal antibodies are raised against peptide fragments as shown in Table 5.

In particular, the antibody may be raised against a particular domain of ANGE, CLLD7 or CLLD8. Such antibodies will be useful in diagnostic and therapeutic aspects of the invention. In particular, the antibodies will be useful in the development of assays for detecting or measuring ANGE, CLLD7 or CLLD8 individually or as spliced hybrids in a sample.

According to a twelfth aspect of the invention, there is provided a process for the preparation of a nucleic acid sequence as defined above, the process comprising ligating together successive nucleotide and/or oligonucleotide residues. Such a process may be carried out using chemical synthesis methods or by using enzymic catalysis. Alternatively, a suitable host cell may be transfected with an appropriate DNA or RNA sequence so as to cause production of the desired sequence in a host cell.

In a thirteenth aspect of the invention, there is provided a process for the preparation of a polypeptide as defined above, the process comprising ligating together successive amino acids and/or oligopeptides. Such a process may be carried out using chemical synthesis methods or by using enzymic catalysis. Alternatively, a suitable host cell may be transfected with an appropriate DNA or RNA sequence so as to cause production of the desired polypeptide in a host cell. The polypeptide may be produced in a cell free system.

In a fourteenth aspect, there is provided a host cell comprising a vector or isolated or recombinant nucleic acid molecule according to the aforementioned aspects. The host cell may comprise an expression vector, or naked DNA encoding the nucleic acid molecules of the invention. A wide variety of suitable host cells are available, both eukaryotic and prokaryotic. Examples include bacteria such as E. coli, yeast, filamentous fungi, insect cells, mammalian cells, preferably immortalised, such as mouse, CHO, HeLa, myeloma or Jurkat cell lines, human and monkey cell lines and derivatives thereof. The host cells are preferably capable of expression of the nucleic acid sequence to produce a gene product (i.e. RNA or protein). Such host cells are useful in drug screening systems to regulate or analyse the polypeptides of the tenth aspect or to identify agents for use in diagnosis or treatment of individuals having, or being susceptible to disease.

The method by which said nucleic acid molecules are introduced into a host cell will usually depend upon the nature of both the vector/DNA and the target cell, and will include those known to a person skilled in the art. Suitable known methods include but are not limited to fusion, conjugation, liposomes, immunoliposomes, lipofectin, transfection, transduction, electroporation or injection, as described in Sambrook et al.

In a fifteenth aspect of the present invention, there is provided a transgenic non-human animal comprising a nucleic acid sequence according to an aforementioned aspect of the invention. Such transgenic non-human animals are useful for the analysis of single nucleotide polymorphisms and their phenotypic effect and so for the analysis of the ANGE, CLLD7 and CLLD8 gene cluster and its phenotypic effect. Expression of a polynucleotide sequence of the invention in a transgenic non-human animal is usually achieved by operably linking the polynucleotide to a promoter and/or enhancer sequence, preferably to produce a vector of the above aspect, and introducing this into an embryonic stem cell of a host animal by microinjection techniques (Hogan et al., A Laboratory Manual, Cold Spring Harbor; and Capecchi, Science (1989) 244: 1288-1292). The transgene construct should then undergo homologous recombination with the endogenous gene of the host. Those embryonic stem cells comprising the desired nucleic acid sequence may be selected, usually by monitoring expression of a marker gene, and used to generate a non-human transgenic animal. Preferred host animals include mice, rabbits and other rodents.

The nucleic acid sequence introduced may not be native to the host animal, i.e. it may be foreign. Such transgenic animals may be distinguished from native, non-transgenic animals using methods known in the art, for example a nucleic acid sample from the transgenic animal may be compared with that from a native animal—the transgenic animal will have a nucleic acid sequence such as a foreign promoter, marker genes etc. Alternatively, the phenotypes of the animals can be compared.

Where it is desirable to use the transgenic non-human animal of the fifteenth aspect to study disease, it may be desirable for the nucleic acid introduced into the animal to encode a variant of ANGE, CLLD7 or CLLD8 which results in asthma, atopy, hayfever, eczema, atopic dermatitis or allergic rhinitis. A transgenic non-human animal may be produced that no longer expresses a native ANGE, CLLD7 or CLLD8 gene or any combination of these genes or any particular splice variant of the genes. These animals may be referred to as “knock-out” (Manipulating The Mouse Embryo—A Laboratory Manual, Hogan et al 1986). In some cases, it may be desirable to modulate the expression of the foreign nucleic acid and/or the native gene in a temporal or spatial manner. This approach removes viability problems if the expression of the native gene is abolished in all tissues.

In a most preferred embodiment, there is provided a transgenic mouse comprising a nucleic acid encoding a variant form of ANGE, CLLD7 or CLLD8 or any combination of these genes or any splice variant of the genes which causes asthma, atopy, hayfever, eczema, atopic dermatitis or allergic rhinitis or non-atopic asthma. Most preferably, the nucleic acid molecule comprises a SNP at a position which corresponds to one or more of those shown in Table 2c, or to Position 185752b4_2, Position 185752b5_3 and/or Position 432101b38_1 of FIG. 5.

Preferably, the mouse is modulated so that it no longer expresses the ANGE, CLLD7 or the CLLD8 gene or any combination of two or more of these genes or any splice variant of the genes in a temporally and/or spatially appropriate manner using homologous recombination techniques or alternatively to over express the ANGE, CLLD7 or CLLD8 gene or any combination of two or more of these genes or any splice variant of the genes to produce a protein as a result of transgenic manipulation.

If a functional polymorphism as shown in Table 4c in the ANGE e.g. ANGE1X3C148T, CLLD7 e.g. CL0703 or CLLD8 gene or any combination of two or more of these genes or any splice variant of the genes is identified (i.e. a “mutation”) a construct containing this polymorphism can be introduced into the mouse germ line (i.e. a knock-in) to produce a pathological variant of a protein rather than knocking it out. Alternatively a pathological variant of the ANGE, CLLD7 or CLLD8 gene or any combination of these genes or any splice variant of the genes may be overexpressed.

In the context of the present invention, atopic diseases include those resulting from overexpression of the ANGE, CLLD7 or CLLD8 gene or any combination of these genes or any splice variant of the genes, or the presence of a variant form of the ANGE, CLLD7 or CLLD8 gene or any combination of these variant genes or any splice variant of the variant genes. Specifically, such diseases include asthma (atopic and non-atopic), atopy, hayfever, eczema, atopic dermatitis or allergic rhinitis.

In a sixteenth aspect of the present invention, there is provided a method of diagnosing, or determining predisposition or susceptibility of a subject to atopy or predicting severity of disease in an individual. The method may comprise determining the presence of a variant form of the ANGE, CLLD7 or CLLD8 gene or any combination of these genes or any splice variant of the genes which is known to be associated with a disease state, or measuring the levels of the ANGE, CLLD7 or CLLD8 gene or any combination of these genes or any splice variant of the genes. A variant form of ANGE, CLLD7 or CLLD8 or any combination of at least two of these genes or any splice variant of the genes includes both nucleic acid and amino acid variants. A variant includes any SNP producing an alteration from the wild-type (e.g. for humans) (FIG. 5/Table 1) or other mutation or alteration from the wild-type.

For example, probes or primers as described above may be useful in detecting nucleic acid encoding ANGE, CLLD7 or CLLD8 or any combination of these genes or any splice variant of the genes or a variant thereof. Information regarding the expression pattern or forms of ANGE, CLLD7 or CLLD8 or any combination of these genes or any splice variant of the genes present will be useful in determining whether the individual is susceptible to diseases resulting from altered expression of ANGE, CLLD7 or CLLD8 or any combination of these genes or any splice variant of the genes.

In a preferred embodiment, the method may additionally, or alternatively, comprise determining the presence or absence of a risk allele which is associated with one or more of the SNP markers of Table 1, where presence of a risk allele is indicative of disease or predisposition to disease or severity of disease. The method may also comprise genotyping one or more known polymorphisms. Any combination of such polymorphisms may be genotyped. Optionally any one or more SNPs in linkage disequilibrium may be used in the method.

The SNPs of the invention are listed in Table 1, where the nature of the polymorphism is described in the format wild type allele/variant allele. The SNPs are positioned with respect to FIG. 5, where nucleotide position 1 is the first nucleotide in FIG. 5.
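As a non-limiting sketch of how the “wild type allele/variant allele” notation and the 1-based nucleotide positions described above might be handled in software, the following may be considered. The function names, the short sequence and the example SNP are hypothetical assumptions for illustration and do not correspond to any entry of Table 1 or to the sequence of FIG. 5.

```python
def parse_snp(notation: str) -> tuple[str, str]:
    """Split 'wild type allele/variant allele' notation into its two alleles."""
    wild_type, variant = notation.strip().upper().split("/")
    return wild_type, variant


def classify_allele(sequence: str, position: int, notation: str) -> str:
    """Classify the base at a 1-based position as wild type, variant or other."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    wild_type, variant = parse_snp(notation)
    # Position 1 is the first nucleotide, so subtract 1 for Python's 0-based index.
    base = sequence[position - 1].upper()
    if base == wild_type:
        return "wild type"
    if base == variant:
        return "variant"
    return "other"


# Hypothetical sequences and SNP (assumptions, not data from the specification).
reference = "GATTACA"
sample = "GATCACA"
print(classify_allele(reference, 4, "T/C"))
print(classify_allele(sample, 4, "T/C"))
```

The sketch handles single-base substitutions only; insertions and deletions would need a different representation.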

The alleles for the remaining SNPs identified in the present invention are described in Table 1.

Any technique, including those known to persons skilled in the art, may be used in the above method. These may include the use of probes or primers as described above, or antibodies of the eleventh aspect, for example in ELISA assays or in immunolocalisation. Preferably, the method comprises first removing a sample from a subject. More preferably, the method comprises isolating from a sample a nucleic acid or a polypeptide sequence.

In particular, methods for use in this aspect include those known to persons skilled in the art for identifying differences between nucleic acid sequences, for example direct probing, allele specific hybridisation, PCR methodology including Pyrosequencing (Ahmadian A, Gharizadeh B, Gustafsson AC, Sterky F, Nyren P, Uhlen M, Lundeberg J. Single-nucleotide polymorphism analysis by pyrosequencing, Anal Biochem. 2000 Apr. 10; 280(1):103-10; Nordstrom T, Ronaghi M, Forsberg L, de Faire U, Morgenstern R, Nyren P. Direct analysis of single-nucleotide polymorphism on double-stranded DNA by pyrosequencing. Biotechnol Appl Biochem. 2000 April; 31 (Pt 2):107-12), Allele Specific Amplification (ASA) (WO93/22456), Allele Specific Hybridisation, single base extension (U.S. Pat. No. 4,656,127), ARMS-PCR, Taqman™ (U.S. Pat. Nos. 4,683,202; 4,683,195; and 4,965,188), oligo ligation assays, single-strand conformational analysis (SSCP; Orita et al PNAS 86 2766-2770 (1989)), Genetic Bit Analysis (WO 92/15712), RFLP, direct sequencing, mass-spectrometry (MALDI-TOF) and DNA arrays. The appropriate restriction enzyme will, of course, be dependent upon the polymorphism and restriction site, and will include those known to persons skilled in the art. Analysis of the digested fragments may be performed using any method in the art, for example gel analysis, or Southern blots.

There is provided a method of diagnosing, or determining predisposition to disease or severity of disease, comprising determining the presence or absence of an allele of a SNP, e.g. a SNP as shown in Table 2c, or at position 185752b4_2, 185752b5_2 and/or 432101b38_1 of FIG. 5, wherein presence of a risk allele is diagnostic of disease or predisposition to disease or severity of disease.

The present invention is advantageous in that it facilitates the accurate diagnosis of disease, or the determination of predisposition to disease or the severity of disease. Thus, by genotyping, an individual may be identified as having or being predisposed to disease and the likely severity of the disease. This helps to identify those individuals who are likely to respond positively to particular treatments or preventative measures. Thus, more effective therapies or preventative measures can be administered.

The diseases, which are associated with the polymorphisms of the invention, include atopic diseases, such as asthma, atopy, hayfever, eczema, atopic dermatitis or allergic rhinitis and non-atopic asthma. Predisposition to disease in the context of the present invention means that these individuals are at higher risk of developing the disease, or a more severe form of the disease, or a particular form of the disease.

In the context of the present invention, a risk allele is the allele of a polymorphism, which is associated with disease or predisposition to disease. The risk allele may be the wild type or the variant allele, as defined below.

The term “polymorphism” refers to the coexistence of multiple forms of a sequence. Thus, a polymorphic site is the location at which sequence divergence occurs. The different forms of the sequence, which exist as a result of the presence of a polymorphism, are referred to as “alleles”. The region comprising a polymorphic site may be referred to as a polymorphic region.

Examples of the ways in which polymorphisms are manifested include restriction fragment length polymorphisms (Botstein et al Am J Hum Genet. 32 314-331 (1980)), variable number of tandem repeats, hypervariable regions, minisatellites, di- or multi-nucleotide repeats, insertion elements and nucleotide or amino acid deletions, additions or substitutions. A polymorphic site may be as small as one base pair, which may alter a codon thus resulting in a change in the encoded amino acid sequence.

Single nucleotide polymorphisms arise due to the substitution, deletion or insertion of a nucleotide residue at a polymorphic site. Such variations are referred to as SNPs. SNPs may occur in protein coding regions, in which case different polymorphic forms of the sequence may give rise to variant protein sequences. Other SNPs may occur in noncoding regions. In either case, SNPs may result in defective proteins or regulation of genes, thus resulting in disease. Other SNPs may have no phenotypic effects, but may show linkage to disease states, thus serving as markers for disease. SNPs typically occur more frequently throughout the genome than other forms of polymorphism discussed above, and there is therefore a greater probability of finding a SNP associated with a particular disease state.

Linkage disequilibrium is the co-inheritance of two alleles at greater frequencies than would be expected from the separate frequencies of each allele. Conversely, alleles are in linkage equilibrium if they occur together at the frequency expected from their separate frequencies. The expected frequency of two alleles inherited together is the product of the frequency of each allele.
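The relationship between observed and expected co-inheritance may be illustrated by the following sketch, which computes the conventional disequilibrium coefficient D as the observed frequency of two alleles occurring together minus the product of their separate frequencies. The function names and the frequencies used are hypothetical assumptions and are not data from the present study.

```python
def expected_joint_frequency(p_a: float, p_b: float) -> float:
    """Expected frequency of two alleles together under linkage equilibrium."""
    return p_a * p_b


def disequilibrium_coefficient(p_a: float, p_b: float, p_ab: float) -> float:
    """Return D = observed joint frequency minus the product of allele frequencies."""
    for p in (p_a, p_b, p_ab):
        if not 0.0 <= p <= 1.0:
            raise ValueError("frequencies must lie between 0 and 1")
    return p_ab - expected_joint_frequency(p_a, p_b)


# Hypothetical frequencies (assumptions for illustration only).
p_a, p_b = 0.5, 0.4
d_linked = disequilibrium_coefficient(p_a, p_b, 0.3)       # alleles seen together more often than expected
d_equilibrium = disequilibrium_coefficient(p_a, p_b, 0.2)  # alleles seen together as often as expected
print(d_linked, d_equilibrium)
```

A value of D at or near zero corresponds to linkage equilibrium, while a positive value indicates that the two alleles are co-inherited more often than their separate frequencies would predict.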

Also provided is a method of diagnosing an individual as having abnormal serum IgE levels, the method comprising demonstrating in the individual the presence or absence of an allele which is associated with the SNP marker 185752b4_2 and optionally any other SNP in linkage disequilibrium with the marker. Also provided is a method for diagnosing an individual as having an STI above 5 mm, the method comprising demonstrating in the individual the presence or absence of an allele which is associated with the SNP marker 432101b38_1 and optionally any other SNP in linkage disequilibrium with the marker.

Where two or more polymorphisms are genotyped, the method preferably comprises determining the presence or absence of a haplotype, which is indicative of disease or predisposition to disease. A haplotype is defined herein as a collection of polymorphic sites in a particular sequence that are inherited in a group, i.e. are in linkage disequilibrium with each other. The identification of haplotypes in the diagnosis of disease helps to reduce the possibility of false positives. The haplotype may be any particular combination of polymorphisms of Table 1, optionally in combination with one or more known polymorphisms. A preferred haplotype is the combination of SNPs as shown in Table 2c; or positions 185752b4_2, 185752b5_3 and 432101b38-1_1 of FIG. 5.

Also provided is a method for diagnosing an individual as being atopic, the method comprising demonstrating in an individual the presence or absence of alleles associated with the haplotype as shown in Table 2c or the haplotype 185752b4_2, 185752b5_3, 4321031b43_1 and optionally any other SNP in linkage disequilibrium with any one of these markers.

The methods of the sixteenth aspect are preferably carried out on a sample removed from a subject. Any biological sample comprising cells containing nucleic acid, preferably that of FIG. 5, is suitable for this purpose. Examples of suitable samples include whole blood, leukocytes, semen, saliva, tears, buccal, skin or hair. For analysis of DNA, mRNA or protein, the sample must come from a tissue in which the sequence of interest is expressed. Blood is a readily accessible sample. Thus, the method of the sixteenth aspect preferably includes the steps of obtaining a sample from an individual, preparing nucleic acid and/or protein from the sample and analysing the nucleic acid or protein sample for the presence or absence of a particular allele or gene or combination of genes of interest or a particular splice variant. Where nucleic acid is to be analysed, it is preferred that an amplification step be performed prior to analysis. A preferred amplification technique is PCR, although any other suitable methods may be employed. Preferably the method uses a pair of primers which hybridise under stringent conditions to a region either side of a SNP. The primers may include an oligonucleotide sequence as shown in Table 4.

The subject is preferably a mammal, and more preferably a human. The subject may be an infant, a child or an adult. Alternatively, the sample may be obtained from the subject prepartum e.g. by amniocentesis.

A subject's risk factor for disease may be determined with reference also to other known genetic factors, and/or clinical, physiological or dietary factors.

The above described methods may require amplification of the DNA sample from the subject, and this can be done by techniques known in the art, such as PCR (see PCR Technology: Principles and Applications for DNA Amplification (ed. H. A. Erlich, Freeman Press, NY 1992); PCR Protocols: A Guide to Methods and Applications (eds. Innis et al., Academic Press, San Diego, Calif. 1990); Mattila et al., Nucleic Acids Res. 19 4967 (1991); Eckert et al., PCR Methods and Applications 117 (1991) and U.S. Pat. No. 4,683,202). Other suitable amplification methods include ligase chain reaction (LCR) (Wu et al., Genomics 4 560 (1989); Landegren et al., Science 241 1077 (1988)), transcription amplification (Kwoh et al., Proc Natl Acad Sci USA 86 1173 (1989)), self sustained sequence replication (Guatelli et al., Proc Natl Acad Sci USA 87 1874 (1990)) and nucleic acid based sequence amplification (NASBA). The latter two methods both involve isothermal reactions based on isothermal transcription which produce both single stranded RNA and double stranded DNA as the amplification products, in a ratio of 30 or 100 to 1, respectively.

Where it is desirable to analyse multiple samples simultaneously, it may be preferable to use arrays as described in WO95/11995. The array may contain a number of probes, each designed to identify variants of the ANGE, CLLD7 or CLLD8 genes or any combination of two or more of these genes or any splice variant of the genes from a sample.

Where a restriction enzyme is required, it can be selected according to the nature of the polymorphism and restriction site. Suitable enzymes will be known to persons skilled in the art. Analysis of the digested fragments may be performed using any method in the art, for example gel analysis, or Southern blots.
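Purely as an illustrative sketch of selecting an enzyme according to the polymorphism and restriction site, the following checks which allele of a SNP produces a given recognition sequence in its flanking context. The flanking sequences and alleles are hypothetical assumptions; GAATTC is the recognition sequence of EcoRI, and because that site is palindromic only the strand shown is examined, which would not be sufficient for a non-palindromic site.

```python
def alleles_with_site(flank_5: str, alleles: list[str], flank_3: str, site: str) -> list[str]:
    """Return the alleles whose sequence context contains the recognition site."""
    site = site.upper()
    matching = []
    for allele in alleles:
        # Rebuild the local sequence with this allele at the polymorphic position.
        context = (flank_5 + allele + flank_3).upper()
        if site in context:
            matching.append(allele)
    return matching


# Hypothetical flanking sequence and alleles (assumptions for illustration only).
ECORI_SITE = "GAATTC"
cut_alleles = alleles_with_site("ACGGAA", ["T", "C"], "TCGGT", ECORI_SITE)
print(cut_alleles)
```

In this hypothetical case only one allele is cleaved, so the two alleles would yield digested fragments of different lengths on gel analysis.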

Determination of an allele of a polymorphism using the above methods typically involves the use of anti-sense sequences i.e. sequences which are complementary to the nucleic acid sequences of interest, which may include part of the sequence of FIG. 5. Such sequences are described in the first to eighth aspects of the invention.

Where it is desirable to identify the presence of multiple single nucleotide polymorphisms, or haplotypes, in a sample from a subject, it may be preferable to use an array. The array may contain a number of probes, each designed to identify one or more of the above single nucleotide polymorphisms of the invention.

An antibody to the ANGE, CLLD7 or CLLD8 genes or any combination of these genes, or the presence or absence of any splice variant of the genes as previously described may be used in the method of the sixteenth aspect. The detection of binding of the antibody to the antigen in a sample may be assisted by methods known in the art, such as the use of a secondary antibody, which binds to the first antibody, or a ligand. Immunoassays including immunofluorescence assays (IFA) and enzyme linked immunosorbent assays (ELISA) and immunoblotting may be used to detect the presence of the antigen. For example, where ELISA is used, the method may comprise binding the antibody to a substrate, contacting the bound antibody with the sample containing the antigen, contacting the above with a second antibody bound to a detectable moiety (typically an enzyme such as horse radish peroxidase or alkaline phosphatase), contacting the above with a substrate for the enzyme, and finally observing the colour change which is indicative of the presence of the antigen in the sample.

Any biological sample comprising cells containing nucleic acid or protein is suitable for this purpose. Examples of suitable samples include whole blood, semen, saliva, tears, buccal, skin or hair. For analysis of cDNA, mRNA or protein, the sample must come from a tissue in which the ANGE, CLLD7 or CLLD8 genes or any combination of two or more of these genes or any splice variant of the genes is expressed. Peripheral blood leukocytes are a readily accessible sample.

In a seventeenth aspect of the invention, there is provided a splice variant of ANGE, CLLD8 or CLLD7 for use in a method of diagnosing an IgE mediated disease, atopy, or a form of atopic disease or non-atopic asthma, or predicting severity of disease, or predisposition to disease.

The splice variant is preferably an RNA, more preferably a mRNA sequence, encoded by the whole or part of the sequence of ANGE or CLLD8 or CLLD7. Transcripts of the ANGE gene and splice variants of the ANGE gene are included as a further aspect of the invention. The splice variants of the seventeenth aspect include at least one exon, or fragment of an exon of ANGE, CLLD8 or CLLD7, or a combination of at least one exon, or fragment of an exon, from at least two of the ANGE, CLLD8 and CLLD7 genes. In particular, the splice variants of ANGE may include transcripts having AB011031 exon 1 or AF155105 exon 1; or comprising at least exon 2; lacking exon 2; comprising at least exon Va or Vb which lies between exons 5 and 6 (FIG. 4G); lacking exon Va or Vb; comprising at least exon 4, 5, 6, 7 and/or 8. The intron/exon map of ANGE is shown in FIG. 3.

In an eighteenth aspect, there is provided the use of a splice variant of ANGE and/or CLLD8 and/or CLLD7 in the manufacture of a diagnostic for use in a method of diagnosing atopy or a form of atopic disease, or predicting severity of disease, or predisposition to disease. Alternatively, the splice variant is provided for use in the manufacture of a medicament for treating disease or for use in a method of treating disease.

In a nineteenth aspect, there is provided a kit comprising a splice variant according to the seventeenth aspect for use in a method according to the sixteenth aspect. Preferably, two or more splice variants are provided, preferably in the form of an array, or on a chip.

In a twentieth aspect, there is provided a polynucleotide sequence comprising the ANGE, CLLD8 or CLLD7 genes, or a polypeptide encoded by the sequence, or fragment thereof, for use in a screen for an agent which inhibits or enhances the activity of ANGE, CLLD8 or CLLD7. Methods of screening for such agents are also provided.

In a twenty-first aspect of the invention there is provided a kit for diagnosis of disease or predisposition to disease, comprising a means for determining the presence or absence of an allele of a SNP of Table 1, wherein the allele is diagnostic of disease, or of predisposition to disease, or of severity of disease.

In a preferred embodiment, the kit comprises a means for determining the presence or absence of one or more risk alleles of polymorphisms according to the eighth aspect. In particular, the kit comprises means for determining the presence or absence of a risk allele of a SNP as shown in Table 2c; or at position 185752b4_2, position 185752b5_3, and/or position 432101b38_1 of FIG. 5.

Preferably the kit will comprise the components necessary to determine the presence or absence of a risk allele of the eighth aspect, in accordance with the sixteenth aspect of the invention. Such components include PCR primers and/or probes, for example those described above, PCR enzymes, restriction enzymes, and DNA or RNA purification means. Preferably, the kit will contain at least one pair of primers, or probes, preferably as described above in accordance with the eighth aspect of the invention. The primers are preferably allele specific primers. Other components include labelling means, buffers for the reactions. In addition, a control nucleic acid sample may be included, which comprises a wild type or variant nucleic acid sequence as defined above, or a PCR product of the same. The kit will usually also comprise instructions for carrying out the diagnostic method, and a key detailing the correlation between the results and the likelihood of disease. The kit may also comprise an agent for the prevention or treatment of disease.

In a twenty-second aspect of the invention, there is provided a method of identifying a compound for treatment of disease, comprising (a) administration of a compound to tissue comprising a nucleic acid molecule comprising one or more SNPs at positions which correspond to positions of FIG. 5 listed in Table 1; and (b) determining whether the compound modulates an effect of the SNPs.

In a preferred embodiment, the isolated nucleic acid molecule is according to the eighth aspect of the invention, and most preferably comprises a SNP as shown in Table 2c; or at a position corresponding to position 185752b4_2 and/or position 185752b5_3, and/or position 432101b38_1 of FIG. 5.

In this aspect, a nucleic acid molecule of the invention, and/or a cell line according to an aforementioned aspect, may be used to screen for agents, which are capable of modulating the effect of a SNP.

Potential agents are those which react differently with a risk allele and non-risk allele. Putative agents will include those known to persons skilled in the art, and include chemical or biological compounds, sense or anti-sense nucleic acid sequence for example as described above, binding proteins, kinases, and any other gene or gene product, agonist or antagonist. Preferably, the agent will be capable of modulating the effects of the disease causing allele. Most preferably, the agent is one which is capable of ameliorating the deleterious effects of the risk allele.

Such agents may be suitable for either prophylactic administration or after a disease has been diagnosed. The route of administration is suitably chosen according to the disease or condition to be treated, however, typical routes of administration of the agent of the present invention include but are not limited to oral, rectal, intravenous, parenteral, intramuscular and sub-cutaneous routes. The invention also provides for agents to be administered either as DNA or RNA and thus as a form of gene therapy. The agents may be delivered into cells directly by means including but not limited to liposomes, viral vectors and coated particles (gene gun).

In a twenty-third aspect of the present invention there is provided an agent or antibody as described above according to the invention, for use in preventing or treating an IgE mediated disease such as asthma, atopy, hayfever, eczema, atopic dermatitis or allergic rhinitis or non-atopic asthma.

There is also provided an agent capable of influencing expression of the ANGE and/or CLLD8 and/or CLLD7 genes for use in a method of treating an IgE-mediated disease e.g. atopy or non-atopic asthma in an individual. Preferably, the agent is capable of influencing the activity of the ANGE and/or CLLD8 and/or CLLD7 gene promoters and/or influencing RNA splicing of the ANGE and/or CLLD8 and/or CLLD7 or ANGE and CLLD7 genes or any combination of two or more of the genes or of any splice variant. Influencing or modulating the activity may include either inhibiting or enhancing, or altering the pattern of activity. Examples of agents according to the twenty-third aspect include but are not limited to proteins, such as transcription factors, which may bind to the ANGE/CLLD8 promoter or splice sites; antibodies or binding partners; ribozymes; and polynucleotide sequences.

Preferred agents for influencing the expression of the genes include polynucleotide sequences which are complementary to the relevant polynucleotide sequence of FIG. 5. Sequence complementarity can be determined using conventional techniques available in the art. Preferred complementary, or antisense, sequences are those which hybridise under stringent conditions to the genes. Suitably stringent conditions are those under which non-specific hybridisation (e.g. to non-ANGE sequences) is avoided.
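By way of illustration only, the simplest check of sequence complementarity can be sketched as follows. The function names and the example sequence are hypothetical and do not appear in the specification. The sketch tests for perfect Watson-Crick complementarity, which is a stricter criterion than hybridisation under stringent conditions; it does not model mismatch tolerance.

```python
# Illustrative sketch only; names and sequences are hypothetical.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    seq = seq.upper()
    invalid = set(seq) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters in sequence: {sorted(invalid)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return seq.translate(_DNA_COMPLEMENT)[::-1]


def is_perfect_antisense(candidate: str, target: str) -> bool:
    """True if candidate is exactly complementary to target over its full length."""
    return candidate.upper() == reverse_complement(target)


if __name__ == "__main__":
    target = "ATGC"  # hypothetical fragment, not taken from FIG. 5
    print(reverse_complement(target))
    print(is_perfect_antisense("GCAT", target))
```

In practice, candidate antisense sequences would be assessed experimentally under the washing conditions described in the following paragraph.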

In relation to the present invention, “stringent conditions” refers to the washing conditions used in a hybridisation protocol. In general, the washing conditions should be a combination of temperature and salt concentration so that the denaturation temperature is approximately 5 to 20° C. below the calculated Tm of the nucleic acid under study. The Tm of a nucleic acid probe of 20 bases or less is calculated under standard conditions (1 M NaCl) as [4° C.×(G+C)+2° C.×(A+T)], according to Wallace rules for short oligonucleotides. For longer DNA fragments, the nearest neighbour method, which combines solid thermodynamics and experimental data, may be used, according to the principles set out in Breslauer et al., PNAS 83: 3746-3750 (1986). The optimum salt and temperature conditions for hybridisation may be readily determined in preliminary experiments in which DNA samples immobilised on filters are hybridised to the probe of interest and then washed under conditions of different stringencies. While the conditions for PCR may differ from the standard conditions, the Tm may be used as a guide for the expected relative stability of the primers. For short primers of approximately 14 nucleotides, low annealing temperatures of around 44° C. to 50° C. are used.

The temperature may be higher depending upon the base composition of the primer sequence used.
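The Wallace rule calculation described above can be sketched as follows. This is a minimal illustration assuming an unmodified DNA oligonucleotide of 20 bases or less; the function names and the example primer are hypothetical, and the sketch makes no attempt at the nearest neighbour method used for longer fragments.

```python
# Illustrative sketch only; names and the example primer are hypothetical.

def wallace_tm(primer: str) -> int:
    """Estimate Tm in degrees C as 4 x (G+C) + 2 x (A+T) for a short oligonucleotide."""
    seq = primer.upper()
    invalid = set(seq) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters in primer: {sorted(invalid)}")
    if not 0 < len(seq) <= 20:
        raise ValueError("the Wallace rule is applied here to probes of 1 to 20 bases")
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 4 * gc + 2 * at


def stringent_wash_range(tm: float) -> tuple:
    """Return the (low, high) wash temperatures lying 20 to 5 degrees C below Tm."""
    return (tm - 20, tm - 5)


if __name__ == "__main__":
    primer = "ATGCATGCATGCAT"  # hypothetical 14-mer: 6 G/C and 8 A/T
    tm = wallace_tm(primer)
    print(tm, stringent_wash_range(tm))
```

The estimate is only a guide: as noted above, PCR conditions differ from the standard 1 M NaCl conditions, so annealing temperatures are normally optimised empirically.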

Antisense sequences which hybridise under stringent conditions to the ANGE or CLLD8 or CLLD7 genes may be useful as primers in any of the aspects of the present invention. Pairs of primers for amplification of all or part of the ANGE, CLLD8 or CLLD7 genes, or alleles, or variants thereof, form another aspect of the invention.

There is also provided the use of an agent or antibody as described above in the manufacture of a medicament for use in preventing or treating an IgE mediated disease such as asthma, atopy, hayfever, eczema, atopic dermatitis or allergic rhinitis or non-atopic asthma. The agents of the above aspect, in particular antisense sequences, may also be useful in diagnosing an individual as having atopy.

According to a twenty-fourth aspect of the invention, there is provided a pharmaceutical composition or medicament comprising a nucleic acid or polypeptide sequence as defined above according to the invention. Alternatively, the pharmaceutical composition may comprise an agent as defined in relation to the above aspect or an antibody according to the eleventh aspect of the invention.

Administration of pharmaceutical compositions is accomplished by any effective route, e.g. orally or parenterally. Methods of parenteral delivery include topical, intra-arterial, subcutaneous, intramedullary, intravenous, or intranasal administration. Administration can also be effected by amniocentesis-related techniques. Oral administration followed by subcutaneous injection would be the preferred routes of uptake; also long acting immobilisations would be used. In addition to the active ingredients, these pharmaceutical compositions may contain suitable pharmaceutically acceptable carriers comprising excipients and other compounds that facilitate processing of the active compounds into preparations which can be used pharmaceutically. Further details on techniques for formulation and administration may be found in the latest edition of “REMINGTON'S PHARMACEUTICAL SCIENCES” (Mack Publishing Co, Easton Pa.).

Pharmaceutical compositions for oral administration can be formulated using pharmaceutically acceptable carriers well known in the art, in dosages suitable for oral administration. Such carriers enable the pharmaceutical compositions to be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, etc., suitable for ingestion by the patient.

Pharmaceutical compositions suitable for use in the present invention include compositions wherein the active ingredients are contained in an effective amount to achieve the intended purpose. Thus, a therapeutically effective amount is an amount sufficient to ameliorate or eradicate the symptoms of the disease being treated. The amount actually administered will be dependent upon the individual to which treatment is to be applied, and will preferably be an optimised amount such that the desired effect is achieved without significant side-effects. The determination of a therapeutically effective dose is well within the capability of those skilled in the art. Of course, the skilled person will realise that divided and partial doses are also within the scope of the invention.

For any compound, the therapeutically effective dose can be estimated initially either in cell culture assays or in any appropriate animal model. These assays should take into account receptor activity as well as downstream processing activity. The animal model is also used to achieve a desirable concentration range and route of administration. Such information can then be used to determine useful doses and routes for administration in humans.

A therapeutically effective amount refers to that amount of agent which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity of such compounds can be determined by standard pharmaceutical procedures, in cell cultures or experimental animals (e.g. ED₅₀, the dose therapeutically effective in 50% of the population; and LD₅₀, the dose lethal to 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index, and it can be expressed as the ratio LD₅₀/ED₅₀. Pharmaceutical compositions which exhibit large therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are used in formulating a range of dosage for human use. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage varies within this range depending upon the dosage form employed, sensitivity of the patient, and the route of administration.
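As a simple worked illustration, the therapeutic index can be computed as below, taking it in its conventional form as the ratio of the lethal dose to the effective dose (LD₅₀/ED₅₀), so that a larger index corresponds to a wider safety margin. The function name and the dose values are hypothetical and are not data from the specification.

```python
# Illustrative sketch only; the name and dose values are hypothetical.

def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return LD50 / ED50; both doses must be positive and in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must both be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical compound: lethal to 50% at 500 mg/kg, effective in 50% at 5 mg/kg.
    print(therapeutic_index(ld50=500.0, ed50=5.0))
```

A compound with an index of this order would be preferred over one whose effective and lethal doses lie close together.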

The exact dosage is chosen by the individual physician in view of the patient to be treated. Dosage and administration are adjusted to provide sufficient levels of the active moiety or to maintain the desired effect. Additional factors, which may be taken into account, include the severity of the disease state. Long acting pharmaceutical compositions might be administered every 3 to 4 days, every week, or once every two weeks depending on half-life and clearance rate of the particular formulation. Guidance as to particular dosages and methods of delivery is provided in the literature (see, US Pat. Nos. 4,657,760; 5,206,344 and 5,225,212 herein incorporated by reference).

According to a twenty-fifth aspect of the invention, there is provided a method of preventing or treating disease in a subject comprising modulating the activity, expression, half life or post-translational modification of ANGE and/or CLLD7 and/or CLLD8 or any combination of two or more of these genes or any splice variant of the genes in the subject.

In addition, the treatment of individuals having an IgE mediated disease or non-atopic asthma includes prevention of atopy, and prophylactic and therapeutic measures.

Preferably, the method is carried out in a subject who has been diagnosed as suffering from, or is susceptible to IgE mediated diseases such as asthma, atopy, hayfever, eczema, atopic dermatitis or allergic rhinitis, or non-atopic asthma.

Preferably, the method comprises determining the presence or absence of a risk allele of a SNP such as one which has an association with IgE mediated disease e.g. at position 185752b4_2, 185752b5_3 and/or 432101b38_1 of FIG. 5; or which has an association with asthma, atopy or a combination thereof e.g. as shown in Table 2c; and if the risk allele is present, administering treatment in order to prevent, delay or reduce the disease.

Preferably, the step of determining the presence or absence of a risk allele is carried out in accordance with the sixteenth aspect, and therefore also comprises determining the presence or absence of risk alleles of SNPs of Table 1, or any combination thereof, for example as described above.

There is also provided a method of preventing or treating an IgE mediated disease or of non-atopic asthma in an individual, the method comprising modulating the expression of the CLLD8 and/or ANGE and/or CLLD7 genes or any combination of two or more of the genes or of any splice variant of the genes. Preferably, the method modulates the production of a gene product according to the first to eighth aspects of the invention. In particular, the twenty-fifth aspect may be achieved by modulating the activity of the CLLD8 promoter, or modulating the splicing of the CLLD8 and/or ANGE genes.

By modulating or influencing is meant inhibiting, enhancing or otherwise altering the expression.

The prevention or treatment of disease according to the twenty-fifth aspect may include the administration of any agent capable of modulating the effects of the ANGE, CLLD7 or CLLD8 genes or any combination of two or more of these genes or fragments of these genes or any splice variant of the genes or of an allele which has an association with disease. Preferably, the agent is one which is capable of ameliorating the deleterious effects of the risk allele. The methods include, but are not limited to, gene therapy techniques. Gene therapy techniques typically involve replacing the nucleic acid sequence comprising the risk allele, or otherwise down regulating the effects of the risk allele. The nucleic acid sequences of the first to eighth aspect, or sequences anti-sense thereto, will be useful in gene therapy.

By modulating is meant inhibiting or increasing the activity of the gene or gene product. Preferably, the activity is inhibited. The activity of the gene or gene product includes any aspect of its production or function, including transcription and translation of nucleic acid sequences, and assembly of the protein, post-translational modification of the protein and downstream interactions with other factors.

The activity of the ANGE, CLLD7 or CLLD8 gene or any combination of two or more of these genes or any splice variant of the genes can be modulated in a number of ways. For example, the expression of the gene may be inhibited through the use of antisense sequences, such as those of the first to eighth aspects of the invention or by the production of antisense RNA sequences. Such sequences when introduced into a subject by gene therapy will hybridise to the ANGE, CLLD7 or CLLD8 gene or to any transcript which is a combination of two or more of these genes or to any splice variant of the genes or to RNA transcribed from the gene or genes, and inhibit its transcription or translation. This method may be particularly useful where it is desirable to modulate the function or expression of certain splice variants of ANGE, CLLD7 or CLLD8 or certain combinations of these genes whilst not affecting others.

Introduction of a nucleic acid sequence may use gene therapy methods including those known in the art. In general, a nucleic acid sequence will be introduced into the target cells of a subject, usually in the form of a vector and preferably in the form of a pharmaceutically acceptable carrier. Any suitable delivery vehicle may be used, including viral vectors, such as retroviral vector systems, which can package a recombinant genome. The retrovirus could then be used to infect and deliver the polynucleotide to the target cells. Other delivery techniques are also widely available, including the use of adenoviral vectors, adeno-associated vectors, lentiviral vectors, pseudotyped retroviral vectors and pox or vaccinia virus vectors. Liposomes may also be used, including commercially available liposome preparations such as Lipofectin®, Lipofectamine® (GIBCO-BRL, Inc. Gaithersburg, Md.), Superfect® (Qiagen Inc, Hilden, Germany) and Transfectam® (Promega Biotec Inc, Madison Wis.).

Other means to modulate a biological activity of the ANGE, CLLD7 or CLLD8 gene or any combination of two or more of these genes or any splice variant of the genes include using agents which may affect interaction of ANGE, CLLD7 or CLLD8 or any combination of two or more of these genes or any splice variant of the genes with downstream factors with which they interact.

Also provided is an agent capable of influencing expression of the ANGE, CLLD8 or CLLD7 gene, for use in a method of preventing or treating an IgE mediated disease or non-atopic asthma in an individual. Preferably the agent is capable of influencing the activity of the ANGE, CLLD8 or CLLD7 gene promoters or of any combination of two or more of the gene promoters. Preferably the agent is capable of influencing RNA splicing of the ANGE, CLLD8 or CLLD7 gene or of any combination of two or more of the transcripts of the genes. Agents include nucleic acid sequences of the first to eighth aspects, polypeptide sequences of the tenth aspect, antibodies of the eleventh aspect, and any other agent defined herein, preferably those which are capable of modulating the activity of ANGE, CLLD7 or CLLD8 or any combination of two or more of these genes or any splice variant of the genes.

The subject may be any animal, preferably a mammal, and more preferably human.

Also provided is the use of an agent as defined above in the manufacture of a medicament for use in the prevention or treatment of an IgE mediated disease or non-atopic asthma, as defined above, in a subject.

According to a twenty-sixth aspect of the invention, there is provided a number of screens. A first screen provides for identifying an agent, which modulates the activity of the ANGE, CLLD7 or CLLD8 gene or any combination of two or more of these genes or any splice variant of the genes comprising:

providing a polypeptide sequence as claimed in the tenth aspect of the invention; providing a substrate; providing an agent to be tested; measuring whether the agent to be tested modulates the activity of the polypeptide by measuring processing of the substrate.

The components of the screen are combined in any order. More than one substrate or polypeptide may be included in the assay.

In the screening assay the polypeptide may be any polypeptide according to the tenth aspect of the invention. Fragments of the ANGE, CLLD7 or CLLD8 genes or any combination of two or more of the genes or any splice variant of the genes may be used. Also, the ANGE, CLLD7 or CLLD8 genes or any combination of these genes or any splice variant of the genes which comprise one or more SNP nucleic acid sequences of the present invention, such as described in the first to eighth aspects may be used. The polypeptide may be purified or non-purified. The polypeptide may be soluble. It may comprise one or more of the domains.

The agent being tested is being identified for use in the prevention or treatment of an IgE mediated disease or disorder or in non-atopic asthma. IgE mediated diseases or disorders include: asthma, atopy, hayfever, eczema, atopic dermatitis or allergic rhinitis.

The substrate may be any which is processed by a polypeptide according to the tenth aspect of the invention. By processed is meant any changes which can be measured. These substrates may be fluorescently labelled or modified to allow easy detection of processing. Such labelling or modification is known to the person skilled in the art.

In a preferred embodiment, the assay is any means of measuring histone methyl transferase activity or nucleotide exchange factor activity known to the person skilled in the art; see, for example, Hama et al., J. Biol. Chem. 274: 1528-15291 (1999).

The present invention further provides a screen for identifying an agent which modulates the activity of the ANGE, CLLD7 or CLLD8 genes or any combination of two or more of these genes or any splice variant of the genes comprising:

providing a polypeptide according to the tenth aspect of the invention; providing an agent to be tested; providing a cell; and measuring whether the agent to be tested modulates the activity of the polypeptide by measuring adhesion of the cell to a surface.

Such a screen can be referred to as a cell adhesion screen (or assay). The components of the screen are combined, in any optional order.

Typically cells used in the cell adhesion assay may be maintained in suspension where adhesion is measured by aggregation of the cells due to intercellular adhesion molecule interactions. Alternatively, adhesion to a surface may be measured. The surface may be a non-biological molecule e.g. tissue culture plastic or it may be a biological molecule, which is cellular or non-cellular. Examples of a non-cellular molecule include extracellular matrix components such as fibronectin, collagen and such like. One or more cells or other biological non-cellular molecules may be attached to a surface such as a tissue culture surface or an extracellular matrix component-coated surface. Adhesion is determined by measuring the adhesion of a cell to a surface. Modulation in cell adhesion may be either an increase in cell adhesion or a decrease in cell adhesion. An agent is considered to be a modulator of the polypeptide of the tenth aspect if it affects the activity or expression of the polypeptide, this may be either at the level of expression of the ANGE, CLLD7 or CLLD8 genes or of the expression of any combination of two or more of the genes or any splice variant of the genes or by altering the half life of any of the ANGE, CLLD7 or CLLD8 mRNA or polypeptide molecules or of any combination of two or more of the genes, or of any splice variant of the genes or by affecting the post-translation modification status of the ANGE, CLLD7 or CLLD8 polypeptides or the polypeptide encoded by any combination of two or more of these genes or the polypeptide encoded by any splice variant of the genes.

The cell may be the host cell of the fourteenth aspect of the invention comprising the vector of the ninth aspect. The twenty-sixth aspect also includes the use of the host cell in screens to identify an agent.

Yet a further aspect of the invention provides a screen for identifying an agent which modulates the activity of the ANGE, CLLD7 or CLLD8 genes or the activity of any combination of two or more of the genes or the activity of any splice variant of the genes comprising:

providing a polypeptide according to the tenth aspect of the invention; providing an agent to be tested; providing a cell; measuring a change in differentiation or proliferation of the cell.

The components of the screen are combined, in any optional order.

Typically, differentiation may be measured by any means known to the persons skilled in the art; for example, in the case of a B-lymphocyte, the change in differentiation can be B-cell activation. The cell may express one or more of the polypeptides of the tenth aspect or be the host cell of the fourteenth aspect. In the case of other cell types it may be the induction or prevention of production of a cell signalling factor such as an immunomodulator e.g. a cytokine or growth factor. The cell signalling factor may be secreted. The immunomodulator may be a peptide regulatory factor or may be any other biological substance whose expression is altered by an agent which modulates the ANGE, CLLD7 or CLLD8 gene or any combination of two or more of these genes or any splice variant of the genes. Typically this assay is performed in vitro, for example in tissue or organ culture, and the cell may be cultured following removal from a patient or animal or the transgenic animal of the fifteenth aspect.

The change in phenotype may be any. It may involve a change in B-cell phenotype.

Such a screen provides an in vitro model for identifying an agent which modulates the activity of the ANGE, CLLD7 or CLLD8 gene or any combination of two or more of these genes or any splice variant of the genes.

Yet a further aspect of the invention provides a screen for identifying an agent which modulates the activity of the ANGE, CLLD7 or the CLLD8 gene or any combination of two or more of the genes or any splice variant of the genes comprising:

providing a transgenic animal according to the fifteenth aspect of the invention; providing an agent to be tested; contacting the transgenic animal with the agent to be tested; detecting a change in the transgenic animal's phenotype.

The components of the screen are combined, in any optional order.

The cell against which the agent is tested may be in suspension, tissue culture, as part of an organ or as part of an animal. Preferably the animal is a laboratory animal, such as a rat, rabbit, mouse or other rodent.

A change in phenotype includes a change in gene expression or in the production of RNA or protein or in cell morphology or behaviour.

Yet a further aspect of the invention provides a screen for detecting a side effect associated with the use of an agent which modulates the activity of the ANGE, CLLD7 or CLLD8 gene or any combination of two or more of the genes or any splice variant of the genes comprising:

providing a cell which does not substantially express the nucleic acid sequence of the first to eighth aspects of the invention or the polypeptide of the tenth aspect of the invention; providing an agent to be tested; contacting the agent to be tested with the cell; and measuring any side effect produced by the agent on the cell.

The components of the screen are combined, in any optional order.

The side effect to be measured may be any, and may depend on whether the cell is part of a larger tissue or animal. It may involve a change in cell differentiation, or cell proliferation. The side effect may be a measure of the change of phenotype of an organ or animal.

Yet a further aspect of the invention provides a screen for identifying an agent which modulates the activity of a polynucleotide according to the first to eighth aspects of the invention or a polypeptide according to the tenth aspect of the invention comprising:

providing an isolated nucleic acid according to the first to eighth aspects of the invention or a polypeptide according to the tenth aspect of the invention; providing an agent to be tested; measuring whether the agent to be tested modulates the activity of the isolated nucleic acid or polypeptide by measuring the interaction of the agent with the sample of nucleic acid or polypeptide.

Preferably this screen is an in vitro transcription assay, measuring transcription of the ANGE, CLLD7 or CLLD8 gene or any combination of two or more of these genes or any splice variant of the genes.

Alternatively, an agent may be identified by the use of theoretical or model characteristics of the ANGE, CLLD7 or CLLD8 gene or of a transcript produced by a combination of two or more of these genes or any splice variant of the genes. The functional or structural characteristics may be of the protein itself or of a computer generated model, a physical two or three-dimensional model or an electronically (e.g. computer) generated primary, secondary or tertiary structure, as well as the pharmacophore (three dimensional electron density map) or its X ray crystal structure.

Putative agents will include those known to persons skilled in the art or new substances, and include chemical or biological compounds, such as anti-sense nucleotide sequences, polyclonal or monoclonal antibodies which bind to a polypeptide sequence of the tenth aspect.

According to a twenty-seventh aspect of the invention, there is provided the use of a nucleic acid sequence or polypeptide sequence as defined above in a screen for an agent which modulates the activity of the ANGE, CLLD7 or CLLD8 gene or of any combination of two or more of the genes or of any splice variant of the genes.

The method preferably comprises contacting a putative agent with a nucleic acid or polypeptide sequence according to an aforementioned aspect of the present invention and monitoring expression and/or activity of the nucleotide or polypeptide sequence. Potential agents are those which alter the activity or expression of the polynucleotide or polypeptide sequence compared to the activity or expression in the absence of the agent. The present method may be carried out by contacting a putative agent with a host cell, tissue culture, or transgenic non-human animal comprising a nucleotide or polypeptide according to the invention, and displaying inflammatory disease.

Also provided are agents identified by the methods of the twenty-sixth aspect.

Preferred features for the second and subsequent aspects of the invention are as for the first aspect mutatis mutandis.

FIG. 1 shows a Linkage Disequilibrium map of the atopy locus.

a) Linkage Disequilibrium Map of the Atopy Locus

A GOLD plot²⁸ of colour coded pair-wise disequilibrium statistics (D′) between markers is shown. The locus extends from the bottom left of the figure to the top right. Red and yellow indicate areas of strong LD. LD is approximately divided into four regions. The scale bar at the bottom indicates a distance of 200 Kb.

b) Detail of LD Around the ANGE (NY-REN-34) Gene Complex

Association to IgE levels is shown above the figure. Genes are shown as black arrows, pointing in the direction of transcription. The scale bar at the bottom indicates a distance of 100 Kb.

FIG. 2 shows the extent of linkage disequilibrium between SNPs on chromosome 13q14.

FIG. 3 a) shows the schematic structure of the gene ANGE

-   b) Detail of structure of ANGE promoter and alternate exons
-   c) Pile up of transcripts by amplification between alternative Exon I and Exon III
-   d) shows transcription of ANGE to immune tissue from alternate Exon 1.

FIG. 4 shows splice variation in ANGE.

a) PCR Amplification of Exons 1-3 in Multiple Tissue cDNA Panels

The presence of a smaller band indicating absence of exon 2 is observed in all tissues

b) PCR Amplification of Exons 4-6 in Multiple Tissue cDNA Panels

The presence of additional bands indicating retention of additional exons (Va and Vb) is observed in lung and immune tissues. The band is present in cDNAs from unactivated lymphocytes, and absent in activated lymphocytes.

c) PCR Amplification of Exons 7-8 in Multiple Tissue cDNA Panels

The presence of additional bands, indicating retention of exon 7, is observed in lung, liver, kidney and pancreas, and immune tissues. The band is most highly expressed in cDNAs from resting T and B cells.

FIG. 5 shows the nucleotide sequence of the ANGE 1 gene (NY-REN-34) as nucleotides 313649-346509, the nucleotide sequence of CLLD8 as nucleotides 294727-309803 and the sequence of CLLD7 as nucleotides 349634-410846 of BAC bA103J18.03548.

FIG. 5 a shows the following sequences for the ANGE gene:

-   (i) Exon sequences;
-   (ii) Protein sequence;

FIG. 5 b shows the following sequences for the NY-REN-34 gene:

-   (i) mRNA sequence;
-   (ii) Protein sequence;
-   (iii) Alternative protein sequence;

FIG. 5 c shows the following sequences for the CLLD7 gene:

-   (i) Exon sequences;
-   (ii) Protein sequence.

FIG. 5 d shows the following sequences for the CLLD8 gene:

-   (i) Exon structure and nucleic acid sequence;
-   (ii) Protein sequence.

FIG. 6 shows the nucleotide sequence of BAC bA101d11.01116.

FIG. 7 shows the nucleotide sequence of BAC bA236m15.00303.

FIG. 8 shows domain architectures of CLLD8, ANGE and related proteins, drawn approximately to scale.

Proteins are described in the text. The SET domains of CLLD8 and ESET are bifurcated due to the presence of large insertions. The IL-5 promoter REII-region-binding protein, that arises from an mRNA initiated within a middle intron of the WHSC1/MMSET gene⁴⁶, is shown.

Domain symbols: AT, AT-hook DNA-binding motif; B, bromodomain; C (white-on-black), cysteine-rich regions flanking SET domains; C, FYRC domain; C₅HCH, zinc finger domain; CpG, methyl-CpG-binding domains; CXXC, CXXC-type zinc finger; HMG, high mobility group domains; N, FYRN domain; P, PHD domains with two (blue rectangle) and one (yellow rectangle) Zn2+-coordinating groups; PW, PWWP domains; SET, Su(var)3-9, Enhancer-of-zeste, trithorax methyltransferase domains; and, T, tudor domains.

FIG. 9 shows co-expression of the CLLD8/ANGE gene complex.

FIG. 10 shows Northern Blots of the gene for ANGE (NY-REN-34) and CLLD8. The presence of large bands and differential splicing in PBMC is apparent.

The figures show alternative probings of the same Northern blot. The expected transcript size of approximately 1.6 Kb for ANGE is seen in all tissues. A 3.0 Kb band is prominent in lymph node and thymus, and polymorphic higher molecular weight bands between 6.0 and 8.0 Kb are visible in immune tissues. The expected 4.0 Kb band is seen for CLLD8 in all tissues. Additional higher molecular weight bands may also be seen, but their distribution does not coincide with that of ANGE.

FIG. 11 shows a Western blot of Cos-7 cell lysates after transient transfection with pcDNA4 expression vectors.

3×10⁵ cells were transfected with 1 μg pcDNA4His/MaxLacZ only (lane 1) or with 1 μg pcDNA4His/MaxLacZ plus 6 μg pcDNA4His/Max-CLLD7 (lane 2), 6 μg pcDNA4His/Max-CLLD8 (lane 3), 6 μg pcDNA4His/Max-REN34 (lane 4) and 6 μg pcDNA4His/Max-ANGE (lane 5). Cells were harvested 24 hours post-transfection and whole cell lysates run on a 10% polyacrylamide gel. Proteins were transferred to PVDF membrane, which was probed with an anti-polyhistidine antibody.

FIG. 12 shows an HMT radioactive assay using nuclear extracts from Cos7 cells transfected with pcDNA CLLD8.

Cells were harvested either 24 or 48 hrs post-transfection and nuclear extracts were tested for HMT activity. The control extracts were treated with Fugene only. All Cos7 extracts were equivalent to 11 μg total protein. Positive control = bacterial SUV39H1(82-412), 10 μg; negative control = no protein.

FIG. 13 shows examples of results obtained with each of the DNA probes containing a potentially functional SNP.

FIG. 14 shows the results of immunolocalisation experiments. CLLD7 (a) and ANGE (b) have a punctate cytoplasmic localisation whereas CLLD8 (c) appears to be restricted to the nucleus of the COS-7 cells.

TABLE 1 shows associations between LnIgE and the identified SNPs in BAC bA103J18.03548, BAC bA101d11.0116 and BAC bA236m15.00303.

The position is in base pairs from the beginning of the reference sequence from the BAC/PAC contig.

TABLE 2 shows association of common b4_2.b5_3.b43_1 haplotypes to total IgE in subject panels.

-   a) SNP/marker associations with total serum IgE.
-   b) Association of common b4_2.b5_3.b43_1 haplotypes to total IgE in subject panels.
-   c) Association with categorical traits (asthma and atopy).

TABLE 3 shows the full length cDNAs isolated from a 1 mb region of FIG. 5.

TABLE 4 shows the primer pairs used in the identification of the SNPs.

-   a) SNPs identified in region.
-   b) Shows the primer pairs used in RFLP assays.
-   c) SNP sequences

PCR amplification between exon XII of CLLD8 and exon III of ANGE. Three bands are observed (A, B and C) and their splice structure is depicted in the insert. Band B, which lacks ANGE exon II, has an open reading frame that continues from CLLD8 through to ANGE. The highest molecular weight band arises from priming from a homologous sequence within the IgGFc locus.

TABLE 5 shows peptide sequences used to generate antibodies.

TABLE 6 shows putative functional promoter SNPs tested by EMSAs.

EXAMPLE 1 Subjects

The primary mapping was carried out on 364 subjects in 80 nuclear families sub-selected from a population sample of 230 families from the rural town of Busselton in Western Australia^(3,51) (The AUS1 panel). Families in the panel included both atopic and non-atopic members, and sibships of three or greater were not exclusively atopic or non-atopic. The AUS2 panel consisted of the remaining 150 nuclear families from the population sample. The UK2 panel consisted of 87 nuclear families recruited through a child attending an asthma clinic in the Oxford region. The families contained 216 offspring (148 sibling pairs). The ECZ panel consisted of 150 nuclear families recruited from the dermatology clinics at the Great Ormond Street Hospital for Children, through a child or children with active AD, as previously described³⁰.

Phenotypes

Skin tests to House Dust Mite (HDM) and mixed grass pollen (less the response of negative controls), specific IgE titres to HDM and Timothy Grass, and the total serum IgE were measured as previously described (Hill, M. R., James, A. L., Faux, J. A. et al. British Medical Journal 311, 776-9 (1995)). A “Skin Test Index (STI)” was calculated as the sum of the prick skin test results to HDM and grass mix (95% of individuals in this population who were atopic reacted either to HDM, or to grass pollen or both). Bronchial responsiveness to methacholine was measured as previously described: the maximum dose administered was 12 μmol. The slope of the dose-response curve was calculated as (pre-dose forced expiratory volume in one second (FEV1) − last FEV1)/the cumulative dose of methacholine. A constant of 0.01 was added to each measurement, to allow log_e transformation when Slope was 0. Eosinophils in peripheral blood were Coulter-counted and the values log_e transformed before analysis.
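The slope calculation and its log_e transformation can be illustrated by the following minimal sketch. The function and parameter names are hypothetical and are not taken from the study; the units of FEV1 are not specified here and are left to the caller.

```python
import math


def dose_response_slope(pre_dose_fev1, last_fev1, cumulative_dose_umol):
    """Fall in FEV1 per unit of cumulative methacholine dose.

    Names are illustrative assumptions; the cumulative dose is in micromoles.
    """
    if cumulative_dose_umol <= 0:
        raise ValueError("cumulative dose must be positive")
    return (pre_dose_fev1 - last_fev1) / cumulative_dose_umol


def log_slope(slope, constant=0.01):
    """log_e transform of the slope after adding the constant 0.01.

    The constant keeps the logarithm defined when the slope is 0.
    """
    return math.log(slope + constant)


# Example: FEV1 falls from 3.0 to 2.4 at the maximum dose of 12 micromoles.
example_slope = dose_response_slope(3.0, 2.4, 12.0)
example_log_slope = log_slope(example_slope)
```

In this sketch a slope at or below −0.01 (FEV1 rising during the challenge) would still raise an error on transformation; how such values were handled is not stated and is not assumed here.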

“Atopy” was defined as a STI > 5 mm, or a RAST score to HDM and Timothy Grass > 2, or a total serum IgE > the 7th decile of the age-corrected population. “Normal” was defined as a STI of 0 and a RAST Index of 0, and a total serum IgE < the 7th decile of the age and sex-matched population. Intermediate phenotypes were classified as unknown.
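The three-way classification can be sketched as follows. This is an illustration only: it assumes a single RAST value summarising the HDM and Timothy Grass results, and it assumes that the caller supplies the appropriate 7th-decile IgE threshold for the subject. All names are hypothetical.

```python
def classify_atopy(sti_mm, rast_score, total_ige, ige_seventh_decile):
    """Return "atopy", "normal" or "unknown" for one subject.

    sti_mm             -- Skin Test Index in mm
    rast_score         -- assumed single RAST score/index for HDM and Timothy Grass
    total_ige          -- total serum IgE
    ige_seventh_decile -- assumed reference threshold supplied by the caller
    """
    if sti_mm > 5 or rast_score > 2 or total_ige > ige_seventh_decile:
        return "atopy"
    if sti_mm == 0 and rast_score == 0 and total_ige < ige_seventh_decile:
        return "normal"
    # Anything between the two definitions is an intermediate phenotype.
    return "unknown"
```

Subjects returned as "unknown" correspond to the intermediate phenotypes.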

The subjects were administered a modified British MRC questionnaire as previously described. “Asthma” was defined as a positive answer to the questions “Have you ever had an attack of asthma?” and “If yes, has this happened on more than one occasion?”

SNP Discovery and Typing

Discovery of SNPs was performed through direct sequencing of non-repetitive DNA fragments that were greater than 1000 bp in length. For each sequence reaction, primers were designed to cover 500-600 bp of genomic sequence. Five individual samples and one pooled DNA panel of 32 individuals were sequenced. Traces were assembled by the Polyphred/Phrap programmes. Following this random SNP discovery, sequencing of all exons with 250 bp leading and trailing DNA was carried out for all potential candidate genes from the region.

SNP typing was by PCR and restriction digestion. In the absence of a natural restriction site, one primer was modified to generate a restriction site. PCR was carried out in a 10 μl reaction which contained 1) 5 μl of 10 ng/μl individual DNA; 2) 5 pmol forward and reverse primers; 3) 0.08 u TaqGold; 4) 1.5-3 mM Mg²⁺. PCR was carried out at 94° C. for 15 minutes, then for 35 cycles of 1) 94° C. for 30 s; 2) 50-60° C. for 30 s; 3) 72° C. for 45 s. After PCR, the plates were checked to confirm that amplification had worked, and 5 μl of digestion solution containing 1-2 u of restriction enzyme was added to each reaction. Samples were run on 2-4% agarose gels after 3-5 hours of digestion. A total of 40 SNPs were typed in the Busselton and UK1 family sets.

Statistical Analysis of Association

Errors in SNP typing were detected by testing for Mendelian errors and by the MERLIN computer program (http://bioinformatics.well.ox.ac.uk/Merlin), which identifies improbable recombination events from dense SNP maps. SNP haplotypes were generated by MERLIN and recoded as individual alleles.
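The first of these checks, the test for Mendelian errors, can be illustrated for a single SNP in a parent-offspring trio. The sketch below is an assumption-laden illustration and is not the MERLIN algorithm, which additionally uses improbable recombination events; the function name and genotype encoding are hypothetical.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """Check one SNP genotype in a trio for Mendelian consistency.

    Each genotype is a pair of allele labels, e.g. (1, 2). The child is
    consistent if it can be formed from one paternal and one maternal allele.
    """
    child_sorted = tuple(sorted(child))
    return any(
        tuple(sorted((paternal, maternal))) == child_sorted
        for paternal, maternal in product(father, mother)
    )


# A 1/2 child cannot arise from two 1/1 parents, so this trio is flagged.
flagged = not is_mendelian_consistent((1, 1), (1, 1), (1, 2))
```

Genotypes flagged in this way would be candidates for re-typing or exclusion before haplotypes are generated.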

Tests of association to quantitative traits were carried out by the QTDT program, which allowed use of markers and phenotypes as covariates in analyses²⁹. Association to asthma and categorical traits was examined by the Monks test routine of QTDT^(29,31).
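For orientation only, the principle of a transmission disequilibrium test for a categorical trait can be sketched with the classic McNemar-type statistic for a biallelic marker. This is a simpler, swapped-in illustration and is not the Monks test routine of QTDT, which handles sibships and nuclear families differently; the names below are hypothetical.

```python
import math


def tdt_statistic(transmitted, untransmitted):
    """Classic TDT chi-square (1 degree of freedom).

    transmitted   -- times heterozygous parents passed the allele to affected offspring
    untransmitted -- times heterozygous parents did not pass the allele
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    return (transmitted - untransmitted) ** 2 / total


def chi_square_1df_p_value(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))
```

Under the null hypothesis of no linkage or association, the allele is expected to be transmitted and not transmitted equally often, so the statistic is close to zero.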

Sequence Analyses

Genomic sequence was analysed using a modification of HPREP (G. Micklem, unpublished); screening for repeat elements in RepBase⁵² using REPEATMASKER (Smit, A. F. and Green, P. http://repeatmasker.genome.washington.edu); for matches to human, rodent, EST, STS and other DNA databases, SWISSPROT, TREMBL and TREMBLNEW peptide databases, CpG islands using CPG⁵³, transcription factor elements and putative promoter regions using PROMOTERSCAN⁵⁴, and exon predictions using GRAIL⁵⁵, GENSCAN⁵⁶, GENEPARSER⁵⁷ and MZEF⁵⁸. Annotations were collated using ACeDB (http://www.acedb.org/). Known genes were identified using BLASTN⁵⁹ against the EMBL DNA database. The peptide databases SWISSPROT, TREMBL and TREMBLNEW were searched using BLASTX⁵⁹ for homologues to transcripts of unknown function. Putative roles for remaining transcripts were established using PSI-BLAST⁵⁹ and SMART⁶⁰.

IMAGE Clone Sequence and Extension

IMAGE clones mapping to the region were obtained from Research Genetics and sequenced on a 377 DNA sequencer using ABI Prism Big Dye Terminator (PE Applied Biosystems). Consensus sequences for each IMAGE clone were aligned by the GCG program. Marathon-Ready™ cDNA RACE libraries were obtained from CLONTECH to extend the 5′ and 3′ cDNA ends of the IMAGE clones. Two gene-specific primers (GSP) were designed for each direction for each consensus. Distinct bands from RACE PCR were cut from gels and purified. The bands were cloned with the ZERO Blunt™ PCR Cloning kit from Invitrogen. The inserts were sequenced using Big Dye Terminator, and integrated into consensus sequences with GCG.

Tissue Expression

Human Multiple Tissue Northern (MTN™) Blots and Human Immune System MTN blots were obtained from CLONTECH. Human Multiple Tissue, Human Immune System and Human Blood Fractions Multiple Tissues cDNA Panels from CLONTECH were used for expression analysis by PCR amplification of target sequences.

Systematic investigation of the exonic and intronic structure of the splice variants of ANGE was carried out by selective PCR, gel separation of products, cloning with ZERO Blunt™ PCR Cloning kit, and Big Dye Terminator sequencing.

EXAMPLE 2

A saturation genetic map of chromosome 13q14 identified a one lod support unit for the location of the atopy locus within a 7.5 cM region centred on D13S161¹⁷. A 1.5 Mb BAC and PAC contig was constructed, centring on D13S273. A positive association between the total serum IgE and alleles of the microsatellite USA724GI in two panels of families⁴ was found.

The limit of detection of linkage disequilibrium (LD) between a disease and a marker given our sample size is likely to be less than 100 Kb^(18,19), suggesting that the atopy gene was within 100 Kb in either direction of USA724GI. This region of chromosome 13q14 is commonly deleted in B-cell chronic lymphocytic leukaemia (BCLL)²⁰.

Scaffold sequence tag sites (STSs) from our BAC/PAC contig were then used to prioritise genomic sequencing of the central 1 Mb of the locus. These STSs were mapped on to BAC contigs built by a combination of Hind III digest fingerprinting and STS content^(23,24). An overlapping set of clones from the RPCI-11 BAC library²⁵ was sequenced using a hierarchical shotgun sequencing strategy²⁶. The sequence of the region (in the form of 3 BACs) is shown in FIGS. 5, 6 and 7.

Linkage Disequilibrium Mapping

SNPs were detected by sequencing repeat-free contigs greater than 1.5 Kb in length in 5 unrelated atopic subjects and 5 unrelated controls, together with a pool of DNA from 32 unrelated individuals.

Association to Quantitative Traits

Forty-seven SNPs and a 15 bp deletion-insertion polymorphism were identified with minor allele frequencies ≥ 20%. These were genotyped in our primary panel of 364 individuals in 80 nuclear families^(3,4). Error checking and haplotype generation were carried out by the MERLIN computer program²⁷. Linkage disequilibrium (LD) between markers was assessed by estimation of D′ from the parental haplotypes¹⁸ and portrayed by the GOLD program²⁸. LD was roughly distributed into three major and one minor islands (A, Aii, B and C) (FIG. 1 a), defining regions in which association to disease could be localised.
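The pair-wise D′ statistic between two biallelic markers can be illustrated with the standard Lewontin definition. The sketch below is not the estimation procedure used by MERLIN or GOLD; it assumes phased two-marker haplotypes are already available, reports the absolute value of D′, and uses hypothetical names throughout.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute Lewontin's D' for two biallelic markers.

    p_ab -- frequency of the haplotype carrying allele A and allele B
    p_a  -- frequency of allele A at the first marker
    p_b  -- frequency of allele B at the second marker
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when a marker is monomorphic")
    return abs(d) / d_max


def d_prime_from_haplotypes(haplotypes, allele_a, allele_b):
    """Estimate D' by counting phased (marker1, marker2) haplotypes."""
    n = len(haplotypes)
    p_a = sum(1 for first, _ in haplotypes if first == allele_a) / n
    p_b = sum(1 for _, second in haplotypes if second == allele_b) / n
    p_ab = sum(1 for hap in haplotypes if hap == (allele_a, allele_b)) / n
    return d_prime(p_ab, p_a, p_b)
```

A value near 1 indicates that at least one of the four possible haplotypes is absent or very rare, and a value near 0 indicates that the alleles at the two markers are approximately independent.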

Association was sought between the Ln(IgE concentration) (LnIgE) and the SNPs by variance components analyses²⁹ (Table 1).

Association to LnIgE clustered around the 185752b4_2 SNP and extended for approximately 100 Kb within the A and Aii islands of LD (Table 1). Association to the STI was less well defined, but seemed to be centred around 432101b38_1 and to extend for 150 Kb. The distance between the two peaks was approximately 160 Kb.

In order to test if these peaks corresponded to distinct QTLs, associations to the LnIgE were tested with the STI as a covariate and vice versa (Table 1). Associations were similarly tested with 185752b4_2 and as covariates. In each case the LnIgE/185752b4_2 complex appeared as distinct from the STI/432101b38_1 complex.

The region of association to LnIgE extended across 3 genes (Table 1) (FIG. 1 b). The identification of the genes is described below. Inclusion of markers in the first or third genes as covariates (b1_1 or 44593_15) did not abolish the association within the middle gene, whereas the use of a marker in this gene as a covariate (b4_2) removed the evidence for association in the outer regions. These results suggested that the QTL is contained within the centre of the markers that show association to the LnIgE.

In order to test for replication, six markers (b11_2, b4_2, b5_3, b43_1, b38_1 and b28_2) were typed in other panels of subjects. In order to minimise the numbers of comparisons, the markers were assembled into 3-marker haplotypes, and multi-allelic tests of association were performed before examining individual haplotypes. The b4_2, b5_3, b43_1 haplotype showed consistent association to the LnIgE in each of the panels tested (Table 2). Two haplotypes containing the b4_2*2 and the b5_3*1 alleles (A and D, Table 2) showed negative association with the LnIgE, although they differed at the b43_1 locus. Positive association was observed with the C haplotype (containing b4_2*1 and b5_3*2) in a panel of families with atopic dermatitis³⁰. These results further suggest that the polymorphism influencing IgE levels is nearest to b4_2 and b5_3.

Association to Categorical Traits

The combined panel of Busselton families (AUS1 and AUS2) may be taken to be representative of the general population. Association was seen to asthma in these subjects with the b4_2 and b5_3 markers (p=0.024 and p=0.017 respectively) using a transmission disequilibrium test³¹.

Gene Identification and Domain Homologies

Unfinished BAC sequence of the region was assembled and annotated. Systematic identification of expressed sequences was carried out by examination of EST databases and from a cDNA selection experiment⁴. Partial sequences from these sources were consolidated into cDNA contigs, and these were further extended by 3′ and 5′ RACE. Northern blotting was carried out to determine transcript sizes, and to examine tissue expression of the genes.

Six full length cDNAs were identified from the 1 Mb region of genomic sequence (Table 3). Four other sequences were found in the EST databases, but did not have open reading frames (ORFs) or a splice structure and were likely to be genomic contaminants (UniGene clusters Hs.58452, Hs.268773, and Hs.212161).

Physical mapping of the chromosome 13q14 BCLL locus cell has recently identified 5 of these genes²². Three were considered novel candidate genes for leukemogenesis and were named as CLLD6, CLLD7, and CLLD8. The other genes are karyopherin-α3 (KPNA3) and the gene corresponding to the NY-REN-34 antigen (ANGE). The sixth gene is Emopamil-binding related protein (EBRP).

Sequence homologies for the three distal genes do not suggest an obvious role in atopy or asthma: EBRP may act as a Δ8-Δ7 sterol isomerase in cholesterol biosynthesis, the sequence of KPNA3 suggests that it is involved in the nuclear transport system³², and CLLD6 contains a SPRY domain, suggesting possible microtubule-binding. The three remaining genes form a tight cluster which contains the region of association to LnIgE levels (Table 1) (FIG. 1 b).

CLLD8

The most proximal gene, CLLD8, contains both a methyl-CpG-binding domain (MBD) and a SET domain²² (FIG. 8). The MBD appears to lack amino acids required to bind methylated CpG^(33,34), but remains likely to bind DNA. SET domains modulate gene expression epigenetically through histone H3 methylation³⁵⁻³⁷. CLLD8 is likely to be an H3 methyltransferase since it contains both active site and flanking cysteine residues that are important for catalytic activity³⁷.

Histone Methyl Transferases

The expression of genes in eukaryotic organisms is dependent on DNA accessibility. In its natural state, DNA is packaged around a set of histones, H2A, H2B, H3 and H4. Further higher order compaction is facilitated by the interaction with H1 histone and other non-histone proteins. In this condensed state, chromatin is inaccessible to the transcription machinery and thus genes contained within it are silent. Histone methyl transferases play a critical role in the regulation of gene expression. In mammalian cells, these enzymes are known to methylate histones H3 and H4 at specific lysine residues. The most widely studied member of this protein family is SUV39H1, which selectively methylates histone H3 at lysine residue 9 (K9). The catalytic domain of this enzyme is contained within a highly conserved sequence known as the SET domain. This sequence is required in combination with two flanking cysteine-rich sequences (Pre-SET and Post-SET) to facilitate histone methylation. Thus the PreSET-SET-PostSET domain is regarded as a characteristic signature of histone methyltransferase proteins. CLLD8 contains an expanded SET domain and a methyl binding domain, a structure that is capable of recognising methylated DNA.

Reference: Kouzarides, T. “Histone Methylation in transcriptional control” (2002) Curr Opin Genet and Devel 12:198-209

ANGE (NY-REN-34)

The next gene is approximately 4 Kb distal to the 3′ end of CLLD8 and is transcribed in the same direction. It encodes NY-REN-34 antigen which was identified by serological analyses of cDNA products from four patients with renal cell carcinoma³⁸. Transcripts of the gene are also highly represented in stomach, tonsil and in B-cells (UniGene cluster 279799). The gene product contains two PHD (plant homeo domain) zinc fingers, which suggest its involvement in chromatin-mediated transcriptional regulation³⁹ (FIG. 8). PHD fingers normally possess two Zn²⁺ co-ordinating groups which contain cysteine and histidine residues. The N-terminal (5′) finger of the NY-REN-34 finger pair, however, lacks one of the two co-ordinating groups.

The arrangement of PHD fingers in NY-REN-34 is characteristic of human proteins such as ALL-1 and AF10 whose genes are fused in some cases of acute lymphoblastic leukaemia⁴⁰ (FIG. 8). Analogy to AF10⁴¹ and ALL⁴² PHD fingers suggests that the NY-REN-34 PHD finger pair is likely to possess a homodimerisation or a protein-binding role or both. NY-REN-34 has also been called BCAP (BRCA1-C terminus associated protein) (EMBL accession AB011031) and is likely to interact with the BRCT domains of BRCA1. These domains are capable of stimulating transcription, remodelling chromatin and interacting with histone-modifying enzymes such as the histone acetyltransferase p300 and the human histone deacetylase, HDAC⁴³.

CLLD7

CLLD7 follows only 3 Kb from the end of ANGE, but is transcribed in the opposite direction. It shows strong protein sequence similarity to RLG and RCC1 (regulator of chromatin condensation 1). RCC1 binds to DNA and to histones H2A and H2B⁴⁴. CLLD7 contains a BTB/POZ domain, which classically forms homophilic and heterophilic dimers.

Remodelling of chromatin structure is important in transcriptional regulation of genes influencing IgE production⁴⁵, so CLLD8, ANGE and CLLD7 may all be considered candidates for influencing atopic processes. However, CLLD8 and ANGE both contain domain homologies to the IL5 promoter REII-region-binding protein (RE-IIBP)⁴⁶, as well as to genes found in leukaemia (ALL)⁴⁰ and multiple myeloma (MMSET)^(46,47) (FIG. 8). This suggests a role in immune regulation and immunoglobulin production.

CLLD8 and ANGE (NY-REN-34) Co-Expression

The close genomic proximity of CLLD8 and ANGE raises the possibility of co-ordinate expression of both genes, or expression of a combined gene product that would be similar to RE-IIBP⁴⁶. PCR between exon XII of CLLD8 and exon III of ANGE identified three bands (A, B and C) which were expressed in most tissues (FIG. 10). Each band contained a specific splice structure (FIG. 3), but our sequence did not identify an extended open reading frame in any of the bands. Despite the absence of an open reading frame, the non-random splice structure suggests a function, which may be regulatory.

Tissue Expression and Splice Variation

We examined northern blots of CLLD8, ANGE and CLLD7. The northern for ANGE showed polymorphic high molecular weight bands (FIG. 10). High molecular weight bands were also seen with CLLD8, but these did not match the tissues in which similar bands were seen with ANGE (FIG. 2). CLLD6 and the other genes from the contig showed the expected size bands with a uniform and ubiquitous tissue distribution.

The ANGE gene contains 10 exons. Examination of the public databases identified a number of alternative first exons with alternative start methionines for protein translation (EMBL:AF155105, AL552215, B1463029, BG759124). We have been able to identify all of these variants by sequencing of specific PCR products from cDNA panels (data not shown). In addition, versions of the cDNA showing skipping of exon II were found (ESTHUM:BF662927 and BE787177, EMBL:AL552215). The AL552215 variant results in an incomplete first PHD domain, which would not be anticipated to be functional.

Exon-specific PCR of cDNA from multiple tissue panels identified exon II skipping variants to be present at approximately the same concentration in all tissues (FIG. 4 a).

Highly tissue-specific splice variants were found which contained additional exons between exons V and VI (named exon Va and Vb). These variants differed by 54 bp, and were present in lung and peripheral blood leucocytes (PBL) (FIG. 4 b). Examination of PBL fractions showed that the splice variants were present in unactivated CD4+, CD8+ and CD19+ cells, but absent in activated cells. Both exons result in a premature stop codon. Alternative splicing with a premature stop codon has previously been identified as a mechanism for negative control of transcription⁴⁸, and a negative role for these variants is consistent with their expression in inactive T and B cells.

A splice variant in which intron VII was retained between exons VII and VIII (ESTHUM:BE141730) was observed; it was most strongly expressed in active CD4+ and CD8+ leukocytes (FIG. 4 c). This variant also results in a premature stop codon.

ANGE

The identification of ANGE (NY-REN-34) by positional cloning rests on three lines of evidence: genetic localisation, tissue expression, and inferred or demonstrated gene function. In the present case, although the region of association to the total serum IgE concentration extends across three genes, our analysis suggests that this is attributable to polymorphism within the gene for ANGE (FIG. 1, Table 1, and Table 2). Domains from CLLD8 and ANGE have homologies with known B-Cell transcription factors. Only one gene, ANGE, has differential expression in immune cells and tissues, and is likely to be responsible for atopy at this locus.

We have carried out further sequencing of ANGE and CLLD8 for 5′ regions, all exons, and non-repetitive areas of introns. One conservative coding (Glu-Gly) variant was found in CLLD8, and showed only weak association with the IgE (p<0.01). One non-conservative (Pro-Ser) variant was found in ANGE (ANGE1x3C148T). A non-conservative coding (Val-Ala) variant was found in CLLD7 (CLD703).

The SNPs within the A island of LD were in strong disequilibrium. Control regions for genes may extend for 100 Kb⁴⁹, and we have observed at least two haplotypes with different effects on serum IgE levels. Our results indicate that loci underlying complex traits will contain several polymorphisms with different functional consequences⁵⁰.

EXAMPLE 3 Sequence Analyses

DNA sequence from overlapping, unfinished BACs was assembled to form larger contigs using contigwalk, a systematic comparison and extension tool using BLASTN (S. J. Broxholme, unpublished). A framework map of the region of interest was prepared using vector scores of STS markers from the critical region in the Genebridge4 Radiation Hybrid panel and the Radiation Hybrid mapping software RHMAPPER (Stein, L., Kruglyak, L., Slonim, D., Lander, E. (1995) http://www.genome.wi.mit.edu/ftp/pub/software/rhmapper.) RHMAPPER was then used to find markers in RHDB (Rodriguez-Tome, P. & Lijnzaad, P Nucleic Acids Res 29, 165-6. (2001)) that could be placed within this framework.

For each RHDB entry placed within the framework, the accession number was found from its annotation, and ESTs were selected for the next stage. TIGR Assembler (Sutton G. G., White O., Adams, M. D. and Kerlavage, A. R. (1995) Genome Science & Technology, 1, 9-19) was used to make a non-redundant set of sequences.

Genomic sequence was analysed using a modification of hprep (G. Micklem, unpublished); screening for repeat elements in RepBase (Jurka, J. Trends Genet 16, 418-20. (2000)) using RepeatMasker (Smit, A. F. A. and Green, P. http://repeatmasker.genome.washington.edu); for matches to human, rodent, EST, STS and other DNA databases, SWISSPROT, TREMBL and TREMBLNEW peptide databases, CPG islands using cpg (Larsen, F., Gundersen, G., Lopez, R. & Prydz, H. Genomics 13, 1095-107. (1992)), transcription factor elements and putative promoter regions using promoterscan (Prestridge, D. S J Mol Biol 249, 923-32. (1995)), and exon predictions using GRAIL (Xu, Y., Mural, R. J. & Uberbacher, E. C. Comput Appl Biosci 10, 613-23. (1994)), GENSCAN (Burge, C. & Karlin, S. J Mol Biol 268, 78-94. (1997)), geneparser (Snyder, E. E. & Stormo, G. D. Nucleic Acids Res 21, 607-13. (1993)) and MZEF (Zhang, M. Q. Proc Natl Acad Sci USA 94, 565-8. (1997)). Annotations were collated using ACeDB (http://www.acedb.org/).

Known genes were identified using BLASTN (Altschul, S. F. et al. Nucleic Acids Res 25, 3389-402. (1997)) against the EMBL DNA database. The peptide databases SWISSPROT, TREMBL and TREMBLNEW were searched using BLASTX (Altschul et al., supra) for homologues to transcripts of unknown function. Putative roles for remaining transcripts were established using PSI-BLAST (Altschul et al., supra) and SMART (Schultz, J., Copley, R. R., Doerks, T., Ponting, C. P. & Bork, P Nucleic Acids Res 28, 231-4. (2000)).

IMAGE Clone Sequence and Extension

IMAGE clones mapping to the region were obtained from Research Genetics and sequenced on a 377 DNA sequencer using ABI Prism Big Dye Terminator (PE Applied Biosystems). The reaction contained 2 μl Bigdye, 2 μl Half Bigdye, 1 μl primer (5 pmol/μl), 14 μl plasmid DNA (400 ng) and1 4-1 μl dsH₂O. The sequence reactions were performed in an MJ thermal cycler: 1) 95° C. for 1 min; 2) 95° C. for 10 s; 3) 50° C. for 10 s; 4) 60° C. for 4 min; 5) steps 2 to 4 repeated for 24 cycles [25 cycles total]; 6) hold at 15° C. Consensus sequences for each IMAGE clone were aligned by the GCG program.

Marathon-Ready™ cDNA RACE libraries were obtained from CLONTECH to extend the 5′ and 3′ cDNA ends of the IMAGE clone sequences. Two gene-specific primers (GSP) were designed for each direction for each consensus. There was about 150-200 bp of overlapping sequence between the two GSP primers. GSP primers were designed to have 25-28 bp, 50-70% GC and a Tm of 65° C. Touchdown PCR was used for RACE of the ends as follows: 1) 94° C. for 30 s; 2) 25-30 cycles of 94° C. for 5 s and 68-72° C. for 4 min.
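The length and GC-content criteria for the gene-specific primers can be checked with a short sketch such as the following. It is illustrative only: the function names are hypothetical, and the Tm criterion is deliberately not evaluated because the method used to calculate Tm is not stated.

```python
def gc_fraction(sequence):
    """Fraction of G and C bases in a DNA sequence (case-insensitive)."""
    seq = sequence.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_gsp_criteria(primer, min_len=25, max_len=28, min_gc=0.50, max_gc=0.70):
    """True if the primer is 25-28 bases long with 50-70% GC content.

    Tm is not checked here; its calculation method is not specified.
    """
    if not min_len <= len(primer) <= max_len:
        return False
    return min_gc <= gc_fraction(primer) <= max_gc


# Hypothetical 26-base sequence used purely to exercise the check.
candidate_ok = meets_gsp_criteria("GC" * 7 + "AT" * 6)
```

A candidate primer failing either criterion would be redesigned before its Tm is considered.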

Distinct bands from RACE PCR were cut from gels and purified. The bands were cloned with ZERO Blunt™ PCR Cloning kit from Invitrogen. The inserts were sequenced using Big Dye Terminator, and integrated into consensus sequences with GCG.

Northern Blot Analysis

Human Multiple Tissue Northern (MTN™) Blots and Human Immune System MTN blots were obtained from CLONTECH. The Human MTN Blot contained RNA from: heart, brain (whole), placenta, lung, liver, skeletal muscle, kidney and pancreas. The Human Immune System MTN contained: spleen, lymph node, thymus, peripheral blood leukocyte, bone marrow and fetal liver. The blots were hybridised with probes generated from cDNA clone or tissue cDNA solution. The average size of probes was 500-1000 bp. For gene il54016, the probe was 1180 bp from exon 1 to exon 10. The probes were radioactively labelled with [³²P]dATP using the random primer method. All probes were hybridised at 42° C. overnight in hybridisation solution (10% Dextran sulphate; 4×SSC; 50 mM sodium phosphate buffer pH 7.2; 1 mM EDTA pH 8.0; 10×Denhardts; 50 mg/ml herring DNA sonicated; 1% SDS).

PCR Screening of MTC Panels

Human Multiple Tissue cDNA Panels and Human Immune System Multiple Tissues cDNA Panels from CLONTECH were used for expression analysis. The Human Multiple Tissues cDNA panel contained cDNA from: heart, brain, placenta, lung, liver, skeletal muscle, kidney and pancreas. The Human Immune System Multiple Tissues panel contained cDNA from: spleen, lymph node, thymus, tonsil, leukocyte, marrow and fetal liver. MTC panels were examined for all cDNA consensus sequences across the genomic sequence of the five BACs. PCRs were carried out in 50 μl which contained: 1) 36 μl deionised H₂O; 2) 5 μl PCR buffer; 3) 1 μl advanTaq Plus; 4) 5 μl 8 mM dNTP and 5) 4 μl 5 μmol forward and reverse primers. PCR was performed as follows: 1) 30 s at 94° C.; 2) 22-38 cycles of 94° C. for 30 s and 68° C. for 2 min; 3) 68° C. for 5 min. In order to observe the abundance of a particular target, a 5 μl sample was removed from each reaction at 22, 26, 30, 34 and 38 cycles.

For ANGE (NY-REN-34), the CLONTECH Human Blood Fractions MTC panel was also tested as above.

Systematic investigation of the exonic and intronic structure of the splice variants of ANGE was carried out by selective PCR, gel separation of products, cloning with ZERO Blunt™ PCR Cloning kit, and Big Dye Terminator sequencing.

Results

We then investigated the genes within the region of association. Unfinished BAC sequence of the region was assembled and annotated. Systematic identification of expressed sequences was carried out by examination of EST databases and from a cDNA selection experiment (G. Anderson. DPhil Thesis, Oxford University 2001). Partial sequences from these sources were consolidated into cDNA contigs and were further extended by 3′ and 5′ RACE. Northern blotting was carried out to determine transcript sizes, and to examine tissue expression of the genes.

Six full-length cDNAs were eventually identified from the 1 Mb region of genomic sequence (Table 3). Five other sequences were found in the EST databases, but did not have open reading frames (ORFs) or a splice structure and were likely to be genomic contaminants (i274117, i46536, i513822, i143317 and i447262). Physical mapping of the chromosome 13q14 B-cell CLL locus has recently identified three of the genes we found, which were named CLLD6 (i2350400), CLLD7 (i44593), and CLLD8 (i626548) (Mabuchi, H. et al. Cancer Res 61, 2870-7. (2001)). These genes were considered novel candidate genes for leukemogenesis. Our analysis of the domain content of these genes agrees with the assessment of Mabuchi et al. that CLLD7 might be involved in cell cycle regulation by chromatin remodelling and that CLLD8, which contains a SET domain, might be associated with methylation-mediated transcriptional repression. We have observed a SPRY domain in CLLD6, which might be involved in microtubule-binding. CLLD6 seems to be beyond the region of association to LnIgE or the STI.

The gene corresponding to image clone i154016 has been previously recognised to code for NY-REN-34 antigen, which was identified by serological analyses of cDNA from four patients with renal cell carcinoma (Scanlan, M. J. et al. Int J Cancer 83, 456-64. (1999)). Transcripts of the gene have also been consistently found in breast carcinoma and in tonsil. The gene contains two PHD domains, one of which is complete and the second of which is N-terminal (5′) to the first and is missing one of its two zinc co-ordinating residue groups (mostly Cys). These two PHD domains are very similar to pairs of PHD domains in Drosophila/human trithorax and human ALL-1, suggesting that the gene for NY-REN-34 antigen is involved in chromatin-mediated transcriptional regulation (Aasland, R., Gibson, T. J. & Stewart, A. F. Trends Biochem Sci 20, 56-9. (1995)).

Image clone i1895799 is Emopamil-binding related protein, which acts as a D8-D7 sterol isomerase. Image clone i626789 is karyopherin alpha 3, with homologies that suggest that it may be involved in the nuclear transport system (Takeda, S. et al. Cytogenet Cell Genet. 76, 87-93 (1997)).

Chromatin structure is important in transcriptional regulation of genes influencing IgE production (Lavender, P., Cousins, D., Smith, P. & Lee, T. Presentation at the National Asthma Campaign International Congress, June 1999. Clin Exp Allergy 30, 1697-708. (2000)) so that the SET domain and PHD domain-containing proteins (CLLD8 and NY-REN-34) are prime candidates for influencing atopic processes.

Mapping of the transcribed sequences back onto the SNP/LD map showed that three of the genes were contained under the peak of association to the LnIgE and the STI (FIG. 1). These genes were NY-REN-34, CLLD7 and the D8-D7 sterol isomerase.

Further examination of the genes was based on their tissue expression. Northern blots of CLLD8, CLLD7 and karyopherin showed ubiquitous expression of a single sized transcript, as previously described. However, NY-REN-34 showed differential splicing, with higher molecular weight bands present in immune tissues (FIG. 2).

This gene was therefore examined in more detail. It consists of 10 exons, and an alternative exon 1 has been seen in W tissue (FIG. 3). Only one of these first exons has a promoter (FIG. 3). From the multiple tissue cDNA (MTC) panels we could only identify PCR products that contained the promoter-associated exon. Further exon-specific PCR amplification of cDNA from the panels showed a number of unexpected bands (FIG. 4) in tissues. Alternative versions with and without exon two were seen, and were present at approximately the same concentration in all tissues (FIG. 4 a). We observed further additional bands when amplifying exons 4-6. These bands were highly tissue-specific, being present in lung and peripheral blood leukocytes (PBL) (FIG. 4 b). Examination of PBL fractions showed that the splice variants were present in unactivated CD4+, CD8+ and CD19+ cells, but absent in activated cells. Sequencing of the splice variant band revealed the presence of an additional exon between exons 5 and 6. This resulted in a premature stop codon. Further splice variants were seen with amplification of exons 7-8 (FIG. 4 c), which were specific to leukocytes.
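The way in which an additional exon can introduce an early in-frame stop codon can be illustrated with a short script. This is a sketch under stated assumptions: the nucleotide strings below are toy sequences invented purely for illustration and are not ANGE sequence, and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(cdna, start=0):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    seq = cdna.upper()
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Toy sequences only (hypothetical, not taken from the gene).
upstream = "ATGGCTGAA"    # stands in for the coding exons up to exon 5
extra_exon = "GGTTAGCC"   # stands in for the additional exon
downstream = "CTGAAATGA"  # stands in for exon 6 onwards, ending in a stop

normal = upstream + downstream
variant = upstream + extra_exon + downstream

if __name__ == "__main__":
    print("normal transcript stop at:", first_stop_codon(normal))
    print("variant transcript stop at:", first_stop_codon(variant))
```

In this toy case the reading frame of the variant reaches a stop codon inside the inserted segment, before any of the downstream sequence is translated.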

The evidence therefore suggested that the gene for NY-REN-34 is responsible for atopy at this locus. It is situated at the peak of association to the LnIgE and the STI, its sequence homology suggests that it acts as a regulatory factor, and it is differentially spliced in the specific tissues known to be involved in the regulation of IgE and the allergic response. We have therefore named the NY-REN-34 gene ANGE (atopy new gene).

There are many cell types and phenotypic readouts which can be measured to assess the contribution of these genes to the disease phenotype, including cell-cell interactions, inflammatory cell recruitment, inflammatory mediator release, and effector functions. We primarily used B cell lines as the model system for the cell based assays and IgE promoter activity as the main readout of B cell function.

EXAMPLE 4

It is of interest that CLLD8 is in close proximity (approximately 4 Kb) to ANGE, that ANGE has an alternate first exon without a promoter, and that high molecular weight bands were observed on Northern blots of ANGE. In order to establish whether CLLD8-ANGE may on occasion form a single gene product, PCR was performed between CLLD8 and ANGE exons in placental cDNA. A band was observed, indicating the presence of a gene with the domain structure CpGBD-PreSET-SET-PostSET-PHD-PHD. A similar domain structure has been observed in IL-5 promoter REII-region-binding protein (Garlisi, C. G. et al. Am J Respir Cell Mol Biol 24, 90-98. (2001)).

Mutations associated with 13q14 have not been identified in any of the genes mapped to this locus. However, the recognition that CLLD8/ANGE form a single transcript with differential splicing (FIG. 9) suggests that this gene may cause atopy.

EXAMPLE 5 Mammalian Cell Electroporation

Below is the protocol for the transfection of B cells by electroporation used in the IgE-luciferase reporter assay. These electroporation conditions have been optimised for B cells by measuring β-galactosidase activity after transfection with pcDNA4/HisMax-LacZ. The conditions used in the optimisation experiments are given in the Results section.

The human Burkitt lymphoma cell line DG-75 (DSMZ) was cultured in fresh RPMI plus 10% FCS 24 hours before transfection. The cells were harvested by centrifugation at 1000 rpm for 10 minutes and washed once in cold RPMI. 5×10⁶ cells were resuspended in 400 μl cold RPMI and transferred to a 0.4 cm gap electroporation cuvette (Bio-Rad) containing 10 μg pGL2 reporter vector, 10 μg pcDNA4/HisMax expression vector, and 1 μg pRL-TK (Promega) for normalisation of data to account for differences in transfection efficiency. A pulse was delivered at 100 μF and 250 V at room temperature using a Gene Pulser II Electroporator (Bio-Rad). Immediately after transfection, 600 μl warm RPMI plus 10% FCS was added to the cells, which were then transferred to 1 ml warm RPMI plus 10% FCS in a 6-well plate. The cells were cultured at 37° C. for up to 24 hours in the presence or absence of human recombinant IL-4 (Sigma).

Measurement of β-Galactosidase Activity in pcDNA4/HisMax-LacZ-Transfected Cells

24 hours post-electroporation, the transfection of B cells was monitored by measuring β-galactosidase activity in cell lysates. Cells were harvested by centrifugation, washed twice in PBS and resuspended in Reporter Lysis Buffer (Promega). After incubating at room temperature for 15 minutes, the lysates were centrifuged at 14,000 rpm for 10 minutes and the supernatants added to an equal volume of 2× β-galactosidase assay buffer (200 mM sodium phosphate pH 7.3, 2 mM MgCl₂, 100 mM β-mercaptoethanol, 1.33 mg/ml ONPG). The reactions were incubated at room temperature or 37° C. until a yellow coloration had developed. 1 ml of 1 M Na₂CO₃ was added and the absorbance read immediately at 420 nm.

X-gal Staining of pcDNA4/HisMax-LacZ-Transfected Cells

Transfected cells were harvested by centrifugation, washed twice in PBS and fixed in 3.7% formaldehyde in PBS for 15 minutes at room temperature. After washing three times with PBS, the cells were incubated in X-gal solution (0.2% X-gal, 2 mM MgCl₂, 5 mM K₄Fe(CN)₆·3H₂O, 5 mM K₃Fe(CN)₆) for 2-16 hours at 37° C.

Dual Luciferase Assay

800 μl of transfected-cell suspension was centrifuged at 300 rpm for 10 minutes and washed with 1 ml PBS. The cell pellet was resuspended in 50 μl passive lysis buffer and incubated for 20 minutes at room temperature on a rotating wheel. The lysates were briefly centrifuged at 14,000 rpm and dual luciferase activity was measured in 20 μl of the supernatant using the Dual-Luciferase Reporter System (Promega) according to the manufacturer's instructions.

WST-1 Proliferation Assay

U2OS cells were seeded in a 96-well plate at 3×10³ cells per well and cultured at 37° C. for 24 hours in DMEM plus 10% FCS. Transfections were performed in triplicate using 300 ng pcDNA4/HisMax expressing CLLD7, CLLD8, REN34 or ANGE and 0.2 μl FuGENE 6 transfection reagent mixed in 20 μl Optimem, as described in section a. The cells were cultured for 24, 48, or 72 hours before adding 10 μl per well of WST-1 cell proliferation reagent. Absorbance was measured at 450/690 nm after a one-hour incubation at 37° C.
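The handling of the triplicate WST-1 readings can be sketched as follows. This is a minimal illustration that assumes "450/690 nm" denotes a measurement at 450 nm with 690 nm as the reference wavelength, and that a control transfection (for example, empty vector) is available for comparison; neither the control nor the function names are specified in the text, and the readings in the usage example are hypothetical.

```python
from statistics import mean


def corrected_absorbance(a450, a690):
    """Subtract the reference-wavelength reading from the 450 nm reading."""
    return a450 - a690


def mean_corrected(wells):
    """Average the corrected absorbance over replicate wells.

    `wells` is a list of (A450, A690) pairs, one pair per replicate well.
    """
    return mean(corrected_absorbance(a450, a690) for a450, a690 in wells)


def relative_proliferation(test_wells, control_wells):
    """Express the test signal as a fraction of the control signal."""
    return mean_corrected(test_wells) / mean_corrected(control_wells)


if __name__ == "__main__":
    # Hypothetical triplicate readings, for illustration only.
    test = [(1.0, 0.5), (1.5, 0.5), (1.25, 0.5)]
    control = [(2.0, 0.5), (2.0, 0.5), (2.0, 0.5)]
    print(relative_proliferation(test, control))
```

Repeating the calculation at the 24, 48 and 72 hour time points would give a simple proliferation time course for each construct.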

Histone Methyltransferase Assay

10 μg of SUV39H1(82-412)-GST protein (1 mg/ml) was added to 5 μg biotinylated H3 peptide (the first 21 amino acids of histone H3 with a biotin label on the CO₂H terminus, from Upstate Biotechnology) in MAB (methylation activity buffer, as described in Rea et al., Regulation of chromatin structure by site-specific histone H3 methyltransferases, Nature 406, 593-599 (2000), but using Tris-Cl pH 8.0) and water to a final volume of 100 μl, allowing for the addition of 600 nCi of ¹⁴C S-adenosyl methionine as the radioactive substrate to initiate the reaction. The reaction was incubated at 30° C. for 90 minutes. 30-50 μl of PBS-washed streptavidin agarose beads (Sigma) were then added for 30 minutes at room temperature with gentle agitation, to bind all biotinylated H3 peptide. Unbound reaction components were removed by washing the beads in at least 10 volumes of PBS, spinning at low speed (~5000 rpm) and removing the supernatant to aqueous waste, taking care not to remove any agarose beads. The beads were then resuspended in 100 μl of PBS and added to 3 ml of scintillant fluid for counting of ¹⁴C-labelled methylated H3 peptide.

Screening methods for modulators of methyltransferase activity are disclosed in Patent WO 01/94621. The screening methods in this patent relate to modulators of murine SUV39H2 methyltransferase.

Screening for Modulators of Suv39h2 Mtase Activity (from WO 01/94621).

All steps are automated and the positions of the different compounds being tested are registered on computer for later reference. Compounds being tested for modulating activity are aliquoted into 384 well plates in duplicate. 20-200 nmol of recombinant GST-tagged human SUV39H2 in MAB buffer is then added to the reaction. 20 nmol of branched peptide ([TARKST]₄-K₂-K-cys) which has been labelled with europium is then added, followed by 100 nmol of S-adenosyl methionine. This reaction is left at room temperature for 40 mins, then transferred onto a second plate which has been coated with the α-methH3-K9 antibody. This reaction is then left at room temperature for 40 mins to allow the antibody to bind methylated substrate. Following capture of methylated substrate, unbound non-methylated substrate is washed off in 50 mM Tris pH 8.5. The europium label is then cleaved from the peptide in 50 μl pH 4.5 enhancement solution for 25 mins. The chelated europium molecules are then excited at 360 nm and the level of emitted fluorescence at 620 nm is measured using time-resolved fluorescence in a PolarStar plate reader. The results are then automatically graphed.

The level of fluorescence is directly related to the level of MTase activity. The effect of the different compounds on the MTase activity can be clearly seen on the graph when compared to control reactions with no compounds added or with no enzyme added. The principle of the screening method is as follows:

-   a) Suv39h2 is incubated with S-adenosyl methionine (SAM) and a chromogenically labelled unmodified peptide substrate (e.g. branched peptide [TARKST]₄-K₂-K-cys). Following methylation, the substrate becomes an epitope for a Lys9-methyl specific antibody which has been immobilised on a microtiter plate. The level of bound peptide can then be quantified by the level of fluorescence from the chromogenic label.
-   b) In the presence of a modulator (e.g. an inhibitor, I) the transfer of methyl groups by the MTase will be affected (decreased); this in turn will affect the amount of substrate captured by the immobilised antibody, which is quantified by the level of fluorescence. A compound with inhibitory effects will result in a decrease in fluorescent signal, whereas a compound with enhancing effects will result in an increase in fluorescent signal.
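The comparison of each test well with the no-compound and no-enzyme control reactions can be sketched as a simple normalisation. This is an illustrative sketch only: the function names, the percentage scale and the 20-point tolerance used to call a compound an inhibitor or enhancer are assumptions and are not taken from WO 01/94621.

```python
def percent_activity(signal, no_enzyme, no_compound):
    """Scale a well's fluorescence between the two control reactions.

    0 % corresponds to the no-enzyme control (background) and 100 % to
    the no-compound control (uninhibited MTase activity).
    """
    window = no_compound - no_enzyme
    if window <= 0:
        raise ValueError("no-compound control must exceed no-enzyme control")
    return 100.0 * (signal - no_enzyme) / window


def classify(signal, no_enzyme, no_compound, tolerance=20.0):
    """Label a compound as 'inhibitor', 'enhancer' or 'inactive'.

    `tolerance` is a hypothetical cut-off in percentage points around 100 %.
    """
    activity = percent_activity(signal, no_enzyme, no_compound)
    if activity < 100.0 - tolerance:
        return "inhibitor"
    if activity > 100.0 + tolerance:
        return "enhancer"
    return "inactive"


if __name__ == "__main__":
    # Hypothetical fluorescence counts, for illustration only.
    print(percent_activity(600, 200, 1000), classify(600, 200, 1000))
```

In practice the cut-off would be chosen from the spread of the duplicate wells and controls on each plate rather than fixed in advance.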

A truncated SUV39H1 (82-412), without the Chromo domain, was amplified by PCR from a Jurkat cDNA library and cloned into pGEX-2T. Histone methyl transferase activity of truncated protein was confirmed by radioactive assay.

Cloning of cDNAs into bacterial expression plasmids

All genes, CLLD7, CLLD8 and ANGE, were successfully amplified from I.M.A.G.E. consortium clones and cloned into the appropriate vectors: pET28a, pGEX-4T and pGEX-6P.

Expression of Proteins in Mammalian Cells

Full-length cDNAs for CLLD7, CLLD8, and ANGE 1 were cloned into pcDNA4/HisMax plasmid (Invitrogen) for use in over-expression studies in B cells. Whole cell lysates were prepared 24 hours post-transfection and protein expression was detected by western blotting using an anti-polyhistidine monoclonal antibody. The bands that were detected in the western blot migrated at approximately the expected sizes for the recombinant proteins: CLLD7, 58 kDa; CLLD8, 82 kDa; REN34, 22 kDa; ANGE, 37 kDa (FIG. 11). This indicates that the pcDNA4/HisMax constructs are functional and express the genes of interest. Confirmation of these results was obtained by probing transfected Cos-7 cell lysates with an antibody to the Express Epitope (Invitrogen) for detection of ANGE and with specific antibodies to CLLD8 and ANGE.

IgE Reporter Assays

The human germline IgE promoter was cloned into the luciferase reporter vector pGL-2 Basic (Promega) in both forward and reverse orientations. pGL-2 Control, in which the luciferase gene is under the control of the SV40 promoter (Promega), was used as a positive control for luciferase activity. For normalisation of data due to differences in transfection efficiency, cells were co-transfected with pRL-TK (Promega).

The effect of over-expression of CLLD7 on IgE promoter activity was measured. Cells were co-transfected with the IgE reporter construct with or without pcDNA4-CLLD7. Luciferase activity was measured 24 hours post-transfection and IgE promoter activity was expressed as the ratio of firefly luciferase activity to Renilla luciferase activity.
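The normalisation used for the reporter assay, firefly signal divided by Renilla signal, and the comparison between cells with and without the expression construct can be written out as a brief sketch. The function names are illustrative assumptions and the luminescence values in the usage example are hypothetical.

```python
def normalised_activity(firefly, renilla):
    """Return the firefly/Renilla ratio used to correct for transfection efficiency."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def fold_change(with_gene, without_gene):
    """Compare normalised promoter activity with and without the expression construct.

    Each argument is a (firefly, Renilla) pair of luminescence readings.
    """
    return normalised_activity(*with_gene) / normalised_activity(*without_gene)


if __name__ == "__main__":
    # Hypothetical luminescence readings, for illustration only.
    print(fold_change(with_gene=(2500, 1000), without_gene=(5000, 1000)))
```

A fold change below 1 would indicate reduced IgE promoter activity in the presence of the over-expressed gene, and a value above 1 would indicate increased activity.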

Histone Methyltransferase Assays

Histone methyltransferases (HMTs) are chromatin-modifying enzymes that function mainly in the nucleus but could also have a role in the cytoplasmic compartment. Therefore, both nuclear and cytoplasmic extracts were purified from transfected Cos7 cells for use in the HMT assay. Human SUV39H1 (used as a positive control in the assay) is a lysine-specific histone methyltransferase that specifically methylates lysine 9 of histone H3. Methylation of H3 has been shown to recruit HP1 protein, resulting in formation of heterochromatin, and may thus be involved in gene regulation/gene silencing (Rea et al.).

Nuclear extracts from Cos7 cells transfected with pcDNA CLLD8 were tested for HMT activity. In these experiments, transfected Cos7 cells showed increased HMT activity relative to the untransfected control (see FIG. 12).

EXAMPLE 6

Identification of SNPs that alter the function of a gene is of importance when the SNP is in a genomic region that is known to be associated with the disease. This example describes one approach to functionally validate a SNP in a putative gene regulatory region such as a promoter or enhancer. Alterations in the normal function of these genomic regions influence the level of transcription of the gene, potentially affecting the level of protein and thereby leading to a disease state. The ability of proteins to bind to DNA with or without the SNP was demonstrated by electromobility shift assays (EMSAs). An alteration in the shifting pattern or intensity when comparing the wild-type and mutant DNA is indicative of differential binding of transcription factors. This differential effect could potentially result in differing expression patterns of the gene.

SNPs Tested

Four SNPs lying within predicted regions of the CLLD7, CLLD8 and ANGE gene promoters were identified by in silico analysis using DB and Celera SNP databases. The description of each SNP, the oligonucleotide probes and the potential transcription factor binding sites are shown in Table 7.

Labelling of Oligos

The DNA target was produced by 3′ biotin end labelling of complementary oligos with or without the SNP, which were then incubated together to allow annealing into a double-stranded target DNA.

EMSA

Optimisation of double-stranded oligo probe and nuclear extract was performed as demonstrated in the table. After incubation at room temperature, the samples were run on 5% polyacrylamide gels and electroblotted onto Biodyne charged membrane. The position of the biotin-labelled oligos was identified using the LightShift chemiluminescent EMSA kit (Pierce) according to the manufacturer's instructions and visualised by exposure to film or a CCD camera.

Results

NR1 Oligos, hcv9873896, FIG. 13 a

The results showed a difference in protein binding to the wild-type and the mutant oligos for NR1. There was an increase in the migration distance for the DNA-protein complex with the mutant oligo. This is possibly due to a decrease in affinity for the mutant oligo, such that the protein dissociates; alternatively, a different protein from the nuclear extract could be binding to the DNA.

NR2 Oligos, clld7x1a295t, FIG. 13 b

There was protein-DNA complex formation for both the wild-type and the mutant oligos but there did not appear to be any difference between the two.

NR3 Oligos, clld7prom1a351g, FIG. 13 c

There was evidence of protein binding with the NR3 wild-type oligo but there was no binding detected with the mutant oligo.

NR4 Oligos, clld8x1a384g, FIG. 13 d

There was a very strong affinity for the protein by both the wild-type and the mutant oligos. There did not appear to be any difference in the affinity between the wild-type and the mutant oligo.

EXAMPLE 6a

In order to identify the cellular localisation of the CLLD7, CLLD8 and ANGE proteins we generated mammalian expression constructs which when transfected into cells have the ability to express the recombinant protein. Using antibodies specific to an epitope tag fused in frame with the protein of interest allows for visualisation of the protein within the cell.

Expression Cloning

The open reading frames of CLLD7 (1590 bp), CLLD8 (2157 bp) and ANGE (993 bp) were generated by PCR using full-length EST clones and the PCR product cloned into pcDNA-His.Max-TOPO (Invitrogen).

The frame and integrity of the ORF were verified by double-stranded sequencing of the insert. The plasmid expression construct was grown in bulk and purified ready for transfection of COS-7 cells.
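As a rough plausibility check on the open reading frame lengths given above, the number of encoded residues and an approximate protein mass can be estimated from the ORF size. This is a sketch under stated assumptions: it assumes each quoted ORF length includes the stop codon, it uses the common rule of thumb of about 110 Da per amino acid residue, and it ignores the mass contributed by the vector-encoded epitope tag.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average mass per residue (assumption)

# ORF lengths in base pairs as quoted in the text.
ORF_LENGTHS_BP = {"CLLD7": 1590, "CLLD8": 2157, "ANGE": 993}


def residues_from_orf(orf_bp, includes_stop=True):
    """Return the number of amino acid residues encoded by an ORF."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


def approx_mass_kda(residues):
    """Estimate protein mass in kDa from the residue count."""
    return residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    for gene, orf_bp in ORF_LENGTHS_BP.items():
        n = residues_from_orf(orf_bp)
        print(f"{gene}: {n} residues, about {approx_mass_kda(n):.0f} kDa untagged")
```

Under these assumptions the estimates come out at roughly 58, 79 and 36 kDa for CLLD7, CLLD8 and ANGE respectively, before any contribution from the epitope tag.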

Transfection and Analysis

Each purified plasmid clone was then transfected into the COS-7 cells using FuGENE transfection reagent (Roche). Prior to transfection the COS-7 cells were cultured on cover slips overnight at 0.5×10⁵ per well in a 24 well plate. Cover slips were then fixed and labelled with Xpress antibody (Invitrogen) and detected using the anti-mouse Cy3 antibody. Images were obtained by immunofluorescence microscopy (FIG. 14 a, b, c).

REFERENCES

-   1. Jarvis, D. & Burney, P. ABC of allergies. The epidemiology of allergic disease [published erratum appears in BMJ 1998 Apr. 4; 316(7137):1078]. British Medical Journal 316, 607-10 (1998).
-   2. Eiberg, H., Lind, P., Mohr, J. & Nielsen, L. S. Linkage relationship between the human immunoglobulin E polymorphism and marker systems. Cytogenetics and Cell Genetics 40, 622 (1985).
-   3. Daniels, S. E. et al. A genome-wide search for quantitative trait loci underlying asthma. Nature 383, 247-50 (1996).
-   4. Anderson, G. G., Leaves N. I., Bhattacharyya S., Zhang Y., Walshe V., Broxholme J., Abecasis G., Levy E., Zimmer M., Cox R., Cookson W.O.C.M. Positive association to IgE levels and a physical map of the 13q14 atopy locus. Eur J Hum Genet (in press) (2002).
-   5. Cookson, W. The alliance of genes and environment in asthma and allergy. Nature 402, B5-11 (1999).
-   6. O'Connor, G. T. & Weiss, S. T. Clinical and symptom measures. Am J Respir Crit Care Med 149, S21-8 (1994).
-   7. Duffy, D. L., Martin, N. G., Battistutta, D., Hopper, J. L. & Mathews, J. D. Genetics of asthma and hay fever in Australian twins. Am Rev Respir Dis 142, 1351-8 (1990).
-   8. Gerrard, J., Rao, D. & Morton, N. A genetic study of immunoglobulin E. Am J Hum Genet 30, 46-58 (1978).
-   9. Palmer, L. J. et al. Independent inheritance of serum immunoglobulin E concentrations and airway responsiveness. Am J Respir Crit Care Med 161, 1836-43 (2000).
-   10. Risch, N. J. & Zhang, H. Mapping quantitative trait loci with extreme discordant sib pairs: sampling considerations. Am J Hum Genet 58, 836-43 (1996).
-   11. Cookson, W. & Palmer, L. Investigating the asthma phenotype. Clin Exp Allergy 28 Suppl 1, 88-9; discussion 108-10 (1998).
-   12. Dizier, M. H. et al. Detection of a recessive major gene for high IgE levels acting independently of specific response to allergens. Genet Epidemiol 12, 93-105 (1995).
-   13. Kimura, K. et al. Linkage and association of atopic asthma to markers on chromosome 13 in the Japanese population. Hum Mol Genet 8, 1487-90 (1999).
-   14. Ober, C. et al. Genome-wide search for asthma susceptibility loci in a founder population. The Collaborative Study on the Genetics of Asthma. Hum Mol Genet 7, 1393-8 (1998).
-   15. Hizawa, N. et al. Genetic regulation of Dermatophagoides pteronyssinus-specific IgE responsiveness: a genome-wide multipoint linkage analysis in families recruited through 2 asthmatic sibs. Collaborative Study on the Genetics of Asthma (CSGA). J Allergy Clin Immunol 102, 436-42 (1998).
-   16. Beyer K, W. U., Freidhoff L, Nickel R, Björksten B, Huang S, Barnes KC, Beaty T, Marsh DG. Evidence for linkage of chromosome 5q31-q33 and 13q12-q14 markers to atopic dermatitis. J Allergy Clin Immunol 101, 152 (1998).
-   17. Bhattacharyya, S., Leaves, N. I., Wiltshire, S., Cox, R. & Cookson, W. O. A high-density genetic map of the chromosome 13q14 atopy locus. Genomics 70, 286-91 (2000).
-   18. Abecasis, G. R. et al. Extent and Distribution of Linkage Disequilibrium in Three Genomic Regions. Am J Hum Genet 68, 191-7 (2001).
-   19. Abecasis, G. R., Cookson, W. O. & Cardon, L. R. The power to detect linkage disequilibrium with quantitative traits in selected samples. Am J Hum Genet 68, 1463-74 (2001).
-   20. Oscier, D. G. Cytogenetic and molecular abnormalities in chronic lymphocytic leukaemia. Blood Rev 8, 88-97 (1994).
-   21. Kalachikov, S. et al. Cloning and gene mapping of the chromosome 13q14 region deleted in chronic lymphocytic leukaemia. Genomics 42, 369-77 (1997).
-   22. Mabuchi, H. et al. Cloning and characterisation of CLLD6, CLLD7, and CLLD8, novel candidate genes for leukemogenesis at chromosome 13q14, a region commonly deleted in B-cell chronic lymphocytic leukaemia. Cancer Res 61, 2870-7 (2001).
-   23. Bentley, D. R. et al. The physical maps for sequencing human chromosomes 1, 6, 9, 10, 13, 20 and X. Nature 409, 942-3 (2001).
-   24. McPherson, J. D. et al. A physical map of the human genome. Nature 409, 934-41 (2001).
-   25. Osoegawa, K. et al. A bacterial artificial chromosome library for sequencing the complete human genome. Genome Res 11, 483-96 (2001).
-   26. Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860-921 (2001).
-   27. Abecasis, G. R., Cherny, S. S., Cookson, W. O. & Cardon, L. R. Merlin--rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet 30, 97-101 (2002).
-   28. Abecasis, G. R. & Cookson, W. O. GOLD--graphical overview of linkage disequilibrium. Bioinformatics 16, 182-3 (2000).
-   29. Abecasis, G. R., Cardon, L. R. & Cookson, W. O. A general test of association for quantitative traits in nuclear families. Am J Hum Genet 66, 279-92 (2000).
-   30. Cox, H. E. et al. Association of atopic dermatitis to the beta subunit of the high affinity immunoglobulin E receptor [see comments]. Br J Dermatol 138, 182-7 (1998).
-   31. Monks, S. A., Kaplan, N. L. & Weir, B. S. A comparative study of sibship tests of linkage and/or association. Am J Hum Genet 63, 1507-16 (1998).
-   32. Takeda, S. et al. Isolation and mapping of karyopherin alpha 3 (KPNA3), a human gene that is highly homologous to genes encoding Xenopus importin, yeast SRP1 and human RCH1. Cytogenet Cell Genet 76, 87-93 (1997).
-   33. Ohki, I., Shimotake, N., Fujita, N., Nakao, M. & Shirakawa, M. Solution structure of the methyl-CpG-binding domain of the methylation-dependent transcriptional repressor MBD1. Embo J 18, 6653-61 (1999).
-   34. Wakefield, R. I. et al. The solution structure of the domain from MeCP2 that binds to methylated DNA. J Mol Biol 291, 1055-65 (1999).
-   35. Rea, S. et al. Regulation of chromatin structure by site-specific histone H3 methyltransferases. Nature 406, 593-9 (2000).
-   36. Jenuwein, T. Re-SET-ting heterochromatin by histone methyltransferases. Trends Cell Biol 11, 266-73 (2001).
-   37. Nakayama, J., Rice, J. C., Strahl, B. D., Allis, C. D. & Grewal, S. I. Role of histone H3 lysine 9 methylation in epigenetic control of heterochromatin assembly. Science 292, 110-3 (2001).
-   38. Scanlan, M. J. et al. Antigens recognised by autologous antibody in patients with renal-cell carcinoma. Int J Cancer 83, 456-64 (1999).
-   39. Aasland, R., Gibson, T. J. & Stewart, A. F. The PHD finger: implications for chromatin-mediated transcriptional regulation. Trends Biochem Sci 20, 56-9 (1995).
-   40. Angioni, A. et al. Interstitial insertion of AF10 into the ALL1 gene in a case of infant acute lymphoblastic leukaemia. Cancer Genet Cytogenet 107, 107-10 (1998).
-   41. Linder, B. et al. Biochemical analyses of the AF10 protein: the extended LAP/PHD-finger mediates oligomerisation. J Mol Biol 299, 369-78 (2000).
-   42. Fair, K. et al. Protein interactions of the MLL PHD fingers modulate MLL target gene regulation in human cells. Mol Cell Biol 21, 3589-97 (2001).
-   43. Miyake, T., Hu, Y. F., Yu, D. S. & Li, R. A functional comparison of BRCA1 C-terminal domains in transcription activation and chromatin remodelling. J Biol Chem 275, 40169-73 (2000).
-   44. Nemergut, M. E., Mizzen, C. A., Stukenberg, T., Allis, C. D. & Macara, I. G. Chromatin docking and exchange activity enhancement of RCC1 by histones H2A and H2B. Science 292, 1540-3 (2001).
-   45. Lavender, P., Cousins, D., Smith, P. & Lee, T. Presentation at the National Asthma Campaign International Congress, June 1999. Controlling the inflammatory response through transcriptional mechanisms. Clin Exp Allergy 30, 1697-708 (2000).
-   46. Garlisi, C. G. et al. A unique mRNA initiated within a middle intron of WHSC1/MMSET encodes a DNA binding protein that suppresses human IL-5 transcription. Am J Respir Cell Mol Biol 24, 90-8 (2001).
-   47. Chesi, M. et al. The t(4;14) translocation in myeloma dysregulates both FGFR3 and a novel gene, MMSET, resulting in IgH/MMSET hybrid transcripts. Blood 92, 3025-34 (1998).
-   48. Walker, W., Girardet, C. & Habener, J. Alternative exon splicing controls a translational switch from activator to repressor isoforms of transcription factor CREB during spermatogenesis. J Biol Chem 271, 20145-50 (1996).
-   49. Flint, J. et al. Comparative genome analysis delimits a chromosomal domain and identifies key regulatory elements in the alpha globin cluster. Hum Mol Genet 10, 371-82 (2001).
-   50. Terwilliger, J. D. & Weiss, K. M. Linkage disequilibrium mapping of complex disease: fantasy or reality? Curr Opin Biotechnol 9, 578-94 (1998).
-   51. Hill, M. R. et al. Fc epsilon RI-beta polymorphism and risk of atopy in a general population sample. BMJ 311, 776-9 (1995).
-   52. Jurka, J. Repbase update: a database and an electronic journal of repetitive elements. Trends Genet 16, 418-20 (2000).
-   53. Larsen, F., Gundersen, G., Lopez, R. & Prydz, H. CpG islands as gene markers in the human genome. Genomics 13, 1095-107 (1992).
-   54. Prestridge, D. S. Predicting Pol II promoter sequences using transcription factor binding sites. J Mol Biol 249, 923-32 (1995).
-   55. Xu, Y., Mural, R. J. & Uberbacher, E. C. Constructing gene models from accurately predicted exons: an application of dynamic programming. Comput Appl Biosci 10, 613-23 (1994).
-   56. Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J Mol Biol 268, 78-94 (1997).
-   57. Snyder, E. E. & Stormo, G. D. Identification of coding regions in genomic DNA sequences: an application of dynamic programming and neural networks. Nucleic Acids Res 21, 607-13 (1993).
-   58. Zhang, M. Q. Identification of protein coding regions in the human genome by quadratic discriminant analysis. Proc Natl Acad Sci USA 94, 565-8 (1997).
-   59. Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389-402 (1997).
-   60. Schultz, J., Copley, R. R., Doerks, T., Ponting, C. P. & Bork, P. SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res 28, 231-4 (2000).

TABLE 1 Association between Qts and SNPs P Values from QTDT LnIgE STI LnIgE STI LnIgE STI STI as lnIgE as 4_2 as 4_2 as 38_1 as 38_1 as MARKER POSITION LnIgE PSTI covariate covariate covariate covariate covariate covariate a) BAC BA101D11 101283b16_1 449 101283b16_2 685 0.0330 0.0240 0.0350 b) BAC BA103J18.03548 101660b13 1 155075 101063b15 1 164307 101458b10 1 182579 101094b11 2 231215 0.0015 0.0420 185316b2 1 297898 185316b2 2 297860 185316b1 2 300838 0.0076 0.0360 0.0120 185316b1 1 301185 0.0017 0.0140 0.0026 185306b7 3 317115 0.0340 0.0220 0.0313 185306b7 2 318304 0.0093 0.0110 185306b7 1 318308 0.0040 0.0380 0.0042 185752b4 1 338162 0.0290 185752b4 2 338679 0.0015 0.0032 0.0002 0.0350 185752b4 3 339505 0.0034 0.0074 0.0028 154016 5S 339679 0.0430 0.0061 185752b5 1 345206 0.0029 0.0210 0.0490 185752b5 2 345771 0.0005 0.0480 0.0040 0.0220 0.0050 185752b5 3 346362 0.0029 0.0210 0.0006 185752b6 1 350507 0.0041 0.0240 0.0026 44593 15 350770 0.0015 0.0130 0.0073 185752b6 3 351465 0.0025 0.0150 0.0034 185752b6 4 351804 0.0053 0.0240 0.0041 432343b33 2 399191 0.0075 0.0089 432343b33 1 399717 0.0140 432343b32 2 401694 154016 9R 346769 154016 1S 313625 154016 4S 335681 154016 4R 335560 154016 4R2 335668 154016 4Ins 335679 154016 5Y 338728 154016_1A.1Y 314883 154016_1A.2Y 314849 154016_5S 399680 432343b32 1 402491 0.0210 c) BAC bA236M15.0030 4321031b43 30907 0.0200 0.0062 0.0170 0.0250 4321031b43 31147 0.0073 0.0140 0.0170 0.0480 1895799 1 76790 143317 1 81615 0.0120 0.0240 0.0150 4321017b38 108259 0.0090 0.0140 0.0150 4321017b38 109085 0.0021 0.0060 0.0048 195226b35 1 183672 0.0031 0.0046 0.0056 1951054b37 197991 1951054b37 198562 236969b17 1 227369 0.0070 0.0140 0.0100 236717b28 1 265325 236717b28 2 265725 0.0310 0.0250 0.0290 2361170b26 364495 0.0230 0.0320 2361170b26 364858 626789 3 128859 626789 1 132421 626789 2 129231 1895799 3 97972 5133822 2 27156 1895799 1 76790 143317 1 81615 5133822 1 27143 1895799 2 97969

TABLE 2A SNP/Marker Associations with Total Serum IgE (LnIgE)
MARKER | SNP NAME (Cross Ref to Table 1) | BASE CHANGE | POSITION | P (QTDT) LnIgE
101063b15_1 | 101063B15.1 | C/G | 164308 |
101458b10_1 | 101458B10.1 | A/G | 182663 |
101094b11_2 | 101094B11.2 | A/G | 231215 |
d8ex7 | d8ex7 | A/G | 294172 | 0.018
185316b2_2 | 185316B2.2 | A/G | 297860 |
185316b2_1 | 185316B2.1 | G/T | 297898 | NS
185316b1_2 | 185316B1.2 | C/G | 300838 | 0.0076
185316b1_1 | 185316b1.1 | A/G | 301185 | 0.0017
d8in12 | d8in12 | C/T | 305881 |
d8in13 | d81n13 | A/G | 308393 |
185306b7_3 | 185306B7.3 | C/G | 317115 | 0.034
185306b7_2 | 185306B7.2 | A/G | 318304 | 0.0093
185306b7_1 | 185306B7.1 | A/C | 318308 | 0.004
154016_2R | 154016_2 | A/G | 324399 | 0.016
185752b4_1 | 185752b4.1 | G/T | 338162 |
185752b4_2 | 185752B4.2 | A/G | 338770 | 0.0015
154016_in5r | 154016in5 | A/G | 338997 | 0.0043
185752b4_3 | 185752B4.3 | C/G | 339509 | 0.0034
154016_5S | i154016.5 | C/G | 339679 | 0.043
Angein7 | angein7 | C/T | 341533 | 0.003
185752b5_1 | 185752B5.1 | A/T | 345206 | 0.0029
185752b5_2 | 185752B5.2 | C/T | 345743 | 0.0005
154016_319 | 185752B5.3 | A/G | 346363 | 0.0011
185752b6_1 | 185752B6.1 | A/G | 350507 | 0.0041
44593_15 | 44593_15 | GAAAGATGATAAAGT | 350770 | 0.0015
185752b6_3 | 185752B6.3 | A/G | 351465 | 0.0025
185752b6_4 | 185752B6.4 | C/T | 351804 | 0.0053
432343b33_2 | 432343B33.2 | C/T | 399283 | 0.0075
432343b33_1 | 432343B33.1 | A/G | 399823 |
432343b32_2 | 432343B32.2 | C/T | 401694 |
432343b32_1 | 432343B32.1 | A/G | 402491 |
5133822_2 | i5133822.2 | G/T | 427516 |
4321031b43_1 | 4321031B43.1 | A/G | 430907 | 0.02
4321031b43_2 | 4321031B43.2 | A/G | 431147 |
18957999_1 | i1895799.1 | C/T | 476787 |
143317_1 | i1433174.1 | A/C | 481729 |
4321017b38_3 | 4321017B38.3 | C/T | 508259 |
4321017b38_1 | 4321017B38.1 | A/G | 509085 |
626789_3 | i626789.3 | C/T | 528859 | NS
195226b35_1 | 195226B35.1 | A/G | 583672 |
1951054b37_1 | 1951054B37.1 | C/T | 597991 |
1951054b37_2 | 1951054B37.2 | C/G | 598562 |
236969b17_1 | 236969B17.1 | A/G | 627368 |
236717b28_1 | 236717B28.1 | G/T | 665326 |
236717b28_2 | 236717B28.2 | C/T | 665725 |
2361170b26_2 | 2361170B26.2 | C/T | 764495 |
2361170b26_1 | 2361170B26.1 | C/G | 764858 |

TABLE 2B Association of common b4_2.b5_3.b43_1 haplotypes to total IgE in subject panels
AUS1 AUS2 UK2 ECZ*
Haplotype freq p (mul) p (ind) n (freq) p (mul) P (ind) freq p (mul) p (ind) freq p (mul) p (ind)
A 2.1.1 0.449 0.011 0.014 0.396 0.12 0.078 0.378 0.0039 0.0153 0.294 0.0012 0.0001
B 1.2.1 0.22 ** 0.229 ** 0.198 ** 0.177 **
C 1.2.2 0.193 ** 0.238 ** 0.192 ** 0.234 0.0019
D 2.1.2 0.084 0.005 0.077 0.036 0.09 0.0012 0.113 **
Haplotype n = 296 n = 533 n = 323 n = 531
*RAST index included as a covariate

TABLE 2C

TABLE 3 Full length cDNAs isolated from region
Name | Position in Contig* | Homology: Function
CLLD8 | 294727-309803 | Methyl CpG binding domain and a SET domain. Epigenetic modulation of gene expression by histone H3 methylation
ANGE (NY-REN-34) | 313649-346509 | PHD zinc finger domains: nuclear protein binding (possibly BRCA1)
CLLD7 | 349634-410846 | RLG and RCC1: cell cycle regulation by chromatin remodelling; histone H2A and 2B docking.
Emopamil-binding related protein (EBRP) | 467557-498344 | D8-D7 sterol isomerase
Karyopherin-α 3 (KPNA3) | 506187-598852 | Nuclear transport protein
CLLD6 | 721749-734946 | SPRY domain: potential microtubule-binding
*The position is in base pairs from the beginning of the reference sequence.
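The intervals in Table 3 can be used to look up which gene, if any, spans a given marker position. The following is a minimal sketch only. It assumes that the positions of Table 2A share the coordinate system of Table 3 and that interval endpoints are inclusive; neither is stated in the tables. The names `GENE_INTERVALS` and `genes_at` are hypothetical and introduced here for illustration.

```python
# Gene intervals copied from Table 3 (base pairs from the start of the
# reference sequence). Short names are used for the two longer entries.
GENE_INTERVALS = [
    ("CLLD8", 294727, 309803),
    ("ANGE", 313649, 346509),
    ("CLLD7", 349634, 410846),
    ("EBRP", 467557, 498344),
    ("KPNA3", 506187, 598852),
    ("CLLD6", 721749, 734946),
]


def genes_at(position):
    """Return the Table 3 genes whose interval contains `position`.

    Endpoints are treated as inclusive (an assumption, see lead-in).
    """
    return [name for name, start, end in GENE_INTERVALS
            if start <= position <= end]


if __name__ == "__main__":
    # Positions taken from Table 2A: Angein7 and 44593_15.
    for marker, pos in [("Angein7", 341533), ("44593_15", 350770)]:
        print(marker, pos, genes_at(pos))
```

Under these assumptions, a position that falls between two intervals returns an empty list, which distinguishes intergenic markers from those inside a listed cDNA.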

TABLE 4A SNPs identified in the region
Base Change | Position | SNP NAME | SNP MARKER (Cross ref to Table 2)
C/G | 3706 | 101063B15.1 | 101063b15_1
A/G | 20722 | 101458B9.1 |
A/G | 22872 | 101458B10.1 | 101458b10_1
C/T | 23748 | 101458B10.2 |
A/G | 71508 | 101094B11.2 | 101094b11_2
A/C | 71903 | 101094B11.1 |
A/G | 78071 | 1011417B14.2 |
A/C | 78790 | 1011417B14.1 |
A/G | 102686 | CLLD8_X1_A384G |
G/T | 102687 | d8ex1 |
A/G | 108716 | d8in1a |
A/T | 109340 | d8in1b |
A/C | 118529 | d8in1c |
C/T | 118887 | d8in3b |
C/T | 118893 | d8in3c |
C/G | 118996 | d8in3d |
A/G | 119053 | CLLD8_X4_G28A |
G/T | 121758 | d8in4a |
C/T | 121801 | d8in4b |
A/G | 121925 | d8in4 |
A/G | 127665 | d8in6a |
A/T | 129915 | d8in6b |
A/G | 132497 | d8in6d |
A/G | 132820 | d8in6e |
A/G | 134465 | d8ex7 | d8ex7
A/G | 138153 | 185316B2.2 | 185316b2_2
G/T | 138191 | 185316B2.1 | 185316b2_1
C/T | 138771 | d8in8 |
A/T | 140748 | d8ex101 |
A/G | 140942 | d8ex10 |
A/T | 140944 | 8ex101 |
C/G | 141131 | 185316B1.2 | 185316b1_2
A/G | 141132 | d8in9 |
A/G | 141478 | 185316b1.1 | 185316b1_1
A/G | 142531 | d8in11c |
A/G | 142712 | d8in11d |
A/C | 143199 | d8in11e |
C/T | 143376 | d8in11f |
A/G | 144314 | d8in12a |
A/T | 145955 | d8in12b |
A/G | 146018 | d8in12c |
C/T | 146172 | d8in12 | d8in12
A/G | 147356 | d8in13a |
A/G | 148684 | d81n13 | d8in13
C/G | 151091 | d8in15b |
C/G | 153918 | i154016.1 |
C/T | 155126 | 154016_1A.1 |
C/G | 157408 | 185308B7.3 | 185306b7_3
A/G | 158149 | angein1a |
A/G | 158597 | 185306B7.2 | 185306b7_2
A/C | 158601 | 185306B7.1 | 185306b7_1
G/T | 158996 | angein1b |
A/G | 164692 | 154016_2 | 154016_2R
C/T | 170985 | angein2 |
C/T | 170987 | i154016.2 |
C/T | 171079 | ANGE1X3C148T |
A/G | 171373 | ANGE1rs2147985 |
C/G | 175937 | ANGE1X4C46G |
A/G | 175961 | 154016_4 |
C/T | 175964 | ANGE1X4C76T |
ACTC/- | 175972 | 154016_4 |
C/G | 175974 | i154016.4 |
G/T | 178453 | 185752b4.1 | 185752b4_1
A/G | 178972 | 185752B4.2 | 185752b4_2
C/T | 179021 | 154016_5 |
A/G | 179288 | 154016in5 | 154016_in5r
C/G | 179798 | 185752B4.3 | 185752b4_3
C/G | 179972 | i154016.5 | 154016_5S
A/G | 181343 | i154016.6 |
C/T | 181824 | angein7 | Angein7
A/T | 182157 | i154016.7 |
(TT)/- | 184474 | ANGE1X9_2_base_deletion_from_position_223 |
(TTAT)/- | 184486 | ANGE1X9_4_base_deletion_starting_from_position_234 |
C/T | 184492 | ANGE1X9C241T |
A/G | 184512 | ANGE1G261A |
C/T | 184592 | ANGE1X9C341T |
C/T | 184595 | ANGE1X9C344T |
C/T | 184601 | ANGE1X9T350C |
C/G | 184602 | ANGE1X9G351C |
C/T | 184627 | ANGE1X9C376T |
A/G | 184743 | ANGE1X9A492G |
C/T | 184916 | angein9a |
A/G | 184937 | angein9b |
A/T | 185499 | 185752B5.1 | 185752b5_1
C/T | 186064 | 185752B5.2 | 185752b5_2
A/G | 186655 | 185752B5.3 | 154016_319
C/T | 186835 | angein10 |
A/G | 187062 | i154016.9 |
TCTTTA/- | 190178 | 6_BASE_INSERTION_STARTING_AT_POS_371 |
AGATAA/- | 190181 | 44593_6 |
A/T | 190246 | d7ex13a |
T/- | 190396 | 1_BASE_DELETION_AT_POS_153 |
A/G | 190429 | CLLD7_X13.4_C120T |
A/C | 190656 | CLLD7_X13.3_RS1046034 |
A/G | 190800 | 185752B6.1 | 185752b6_1
GAAAGATGATAAAGT | 191055 | 44593_15 | 44593_15
G/T | 191122 | d7ex13b |
C/T | 191134 | CLLD7X13.2_RS1046028 |
C/T | 191168 | d7ex13c. |
C/G | 191220 | 185752B6.2 |
A/G | 191749 | 185752B6.3 | 185752b6_3
A/T | 191763 | d7ex13e |
C/G | 191818 | CLLD7_X13_C499G |
C/T | 192097 | 185752B6.4 | 185752b6_4
A/G | 192200 | d7ex13f |
A/T | 198826 | d7in12 |
C/T | 199514 | CLLD7_RS1536195 |
C/G | 199779 | d7ex11 |
C/G | 207467 | d7ex9a |
A/G | 207470 | d7ex9b |
A/G | 207494 | d7ex9c |
G/T | 207533 | d7ex9d |
A/G | 210227 | CLLD7_X7_RS2274278 |
C/T | 210512 | CLLD7_RS_2274281 |
C/T | 213444 | d7in6 |
A/G | 213460 | CLLD7_X6_RS2274284 |
C/T | 217944 | d7ex5a |
A/G | 217995 | d7ex5b |
C/T | 225121 | d3in3 |
A/G | 225190 | d3ex3a |
A/G | 225207 | d3ex3b |
C/T | 239484 | 432343B33.2 | 432343b33_2
A/G | 240010 | 432343B33.1 | 432343b33_1
C/T | 241987 | 432343B32.2 | 432343b32_2
A/G | 242784 | 432343B32.1 | 432343b32_1
A/T | 243337 | CLLD7_X1A295T |
C/T | 243831 | CLLD7_PROM_1_A351G |
A/T | 266607 | 432363B41.1 |
A/G | 278239 | i5133822.1 |
G/T | 278252 | i5133822.2 | 5133822_2
A/G | 282003 | 4321031B43.1 | 4321031b43_1
A/G | 282243 | 4321031B43.2 | 4321031b43_2
C/T | 283853 | 4321031B44.1 |
C/T | 327886 | i1895799.1 | 1895799_1
A/C | 332711 | i1433174.1 | 143317.1
C/T | 349070 | i1895799.2 |
A/C | 349073 | i1895799.3 |
C/T | 357678 | 4321017B38.4 |
C/T | 359360 | 4321017B38.3 | 4321017b38_3
C/G | 359691 | 4321017B38.2 |
A/G | 360186 | 4321017B38.1 | 4321017b38_1
A/G | 360524 | i626789.4 |
C/T | 379960 | i626789.3 | 626789_3
A/T | 380332 | i626789.2 |
C/T | 383520 | 626789.1 |
C/T | 383522 | i626789.1 |
C/G | 407505 | 1951336b18.1 |
A/G | 434774 | 195226B35.1 | 195226b35_1
G/T | 443990 | 195226B34.1 |
C/T | 449093 | 1951054B37.1 | 1951054b37_1
C/G | 449664 | 1951054B37.2 | 1951054b37_2
A/G | 478470 | 236969B17.1 | 236969b17_1
G/T | 516429 | 236717B28.1 | 236717b28_1
C/T | 516829 | 236717B28.2 | 236717b28_2
A/G | 549422 | 236357B29.1 |
A/C | 593629 | 236618B30.1 |
C/T | 615612 | 2361170B26.2 | 2361170b26_2
C/G | 615975 | 2361170B26.1 | 2361170b26_1

TABLE 4B
No | Marker | Position | SNP | F primer | R primer | Primer modified | Enzyme
1 | 101660b13_1 | 155075 | A/T | ATGTGGGCTGGAGTTGTG | CAAGTGTGTGACATATATC | no | MnlI
2 | 101063b15_1 | 164308 | G/C | GCTCTTAATTCCCTATTGC | CTGGAGGAATTCTACCCC | no | AvrI
3 | 101458b10_1 | 182663 | A/G | ACCCAACGTAATCTACAGGTGCA | GTCAAGTCATGTGATTTCTTC | yes | BtsI
4 | 101283b16_1 | 183384 | T/C | GCATTTCTCCAGGGCTTTG | ACTGCCTGAGGACCAAAC | no | BsaAI
5 | 101283b16_2 | 183563 | C/T | CAGAGGCTGGGAGAGAG | CGTTGGGAGGGAAGTCAG | no | HphI
6 | 101094b11_2 | 231215 | C/T | CCAAGTGTACAGGGAATAG | AATGAAAGTAGAAAAGGGCC | yes | HpaII
7 | 185316b2_2 | 297860 | C/T | GTTATTTTGTCACTTTGACAAG | ACATTTATCAAGACAGTCTC | no | MseI
8 | 185316b2_1 | 297898 | C/A | GTTATTTTTGTCACTTGACAAG | ACATTTATCAAGACAGTCTC | no | Cac8I
9 | 185316b1_2 | 300838 | G/C | GGTCCCAACTACATTTTGC | GCTTACTGTGTAAGTAACAG | no | BsrI
10 | 185316b1_1 | 301185 | T/C | ATCTCTAATAACTGAATGGTA | AACATAGATGTTCTTACTTAG | yes | RsaI
11 | 185306b7_3 | 317115 | G/C | AATAGGAATAAGGATGTGAGT | GAAATCTTCTATCCTGAATTC | yes | HinfI
12 | 185306b7_2 | 318304 | T/C | AAGCGATACTATCTGACAG | AATCTGTACAGTTCCTGAG | no | MseI
13 | 185306b7_1 | 318308 | T/G | TAAATCCCCCCATATCGCAG | TGAGAGATGCCTGTCTGTG | yes | BtsI
14 | 185752b4_1 | 338162 | T/G | CACTTCATTTCTTCATCATAG | GCAAGCTTAATTTAAGGAGA | yes | BsmAI
15 | 185752b4_2 | 338770 | A/G | GTTITAAAGAAAGGTAAACAATT | CAAAGAGAATAGGTCTTGTTC | yes | MfeI
16 | 185752b4_3 | 339509 | C/G | CACCAATTATATTTGTTTTCC | GGACGATTTGGAAAGTACC | no | NspI
17 | 154016_5S | 339479 | G/C | CCTTTATCATAAGTGCTGC | AAGAGCTTCCCGACTATG | no | HpyCH4III
18 | 185752b5_1 | 345206 | A/T | GAGTGAGTGCCCTCC | GGTTATAAATGTAGTAAAGAA | yes | EcoRI
19 | 185752b5_2 | 345743 | C/T | CTAAGTTTCCACAATTAAAAG | GTCCCTTTGTATATCTTTGAG | no | AluI
20 | 185752b5_3 | 346362 | G/A | GCATCTCAACAAAGGTGGC | CATGACTCTTGGCTTAGGAGA | yes | BsmAI
21 | 154016_319 | 346363 | G/A | TGACTCTTGGCTTAGGACA | CATGTCAACAAAGGTGGC | yes | NlaIII
22 | 185752b6_1 | 350507 | A/G | CCCACTTGTAATTTGGCTC | AAGACAGTAGTTTATTTGAAC | no | SplI
23 | 44593_15 | 350770 | 15 bp del | GGAACACCATAATTTTGGTG | TCTTCAAGTCAACCCTCTTA | no | -
24 | 185752b6_3 | 351465 | A/G | CAACTTTTCACGGATACTTC | TACGGCTTTTCTTGAGCC | no | BsaAI
25 | 185752b6_4 | 351804 | T/C | CCCTTTTACATGTTCCAGC | TTTTGTTGTGTGTCCAGGAT | no | MnlI
26 | 432343b33_2 | 399283 | A/G | CTCAAGTCCCCAGATCTC | CAGGTACCTGTGGGATATTAAA | no | AciI
27 | 432343b33_1 | 399823 | T/C | ACTTCTGCTTCAGATCCC | ATAGTAATCACAGAATGGGAT | yes | FokI
28 | 432343b32_2 | 401649 | A/G | CAAAGCTGTCAAGCTTGTAG | GAAGGGATATCATCATCATGA | yes | PshAI
29 | 432343b32_1 | 402491 | T/C | GGCAGGGATTTTGTTAGTG | AGTGTCTTACATGCAAATACAT | yes | NlaIII
30 | 5133822_2 | 427516 | G/T | TATGTAGGAATAAGGCAATG | AGTTTCTTCAACAACCTGC | yes | PstI
31 | 4321031b43_1 | 430907 | G/A | TGATCCTCAACACAGCTC | ACATTTTTTCCAGAGATATTGTA | yes | RsaI
32 | 4321031b43_2 | 433147 | A/G | CTCAAGTCCCCAGATCTC | CAGGTACCTGTGGGATATTAAA | no | AciI
33 | 1895799_1 | 476787 | C/T | GAGGCGTTTGACAGTCAC | TTAGGGTGAGACATACCC | no | BanI
34 | 143317_1 | 481792 | C/A | CAGAGAAAGACATGGGAG | CTACTGTTTGTGCCACAG | no | BbcVI
35 | 4321017b38_3 | 508259 | G/A | GGAAGACTAAAGAGCTGC | TTTATTCAATAACCAAGTTTC | yes | PstI
36 | 4321017b38_1 | 509085 | T/C | CTTCATTATAAGGTAAAGCTG | TAACACGTTTCGAAACTGC | yes | Fnu4H
37 | 195226b35_1 | 583672 | T/C | GCTTTTTTTTGTCTTTTTAGATCA | GTACACAAATTAAGTTATACAAG | yes | NlaIII
38 | 1951054b37_1 | 597991 | C/T | GTAAACAATGAAACAAAAACC | GCAGCGCTAGAGGTCTC | no | MboII
39 | 1951054b37_2 | 598562 | C/G | GCAACTAACTGAGTGACC | TCCGACCTCCGCTCCTA | yes | AvrII
40 | 236969b17_1 | 627368 | C/T | CACATCTCATAATGGTATCAC | GAAGAAGGGTATGCTTTTAGA | yes | MboII
41 | 236717b28_1 | 665326 | T/G | TTCTTGGTAAACATCTCTAC | CCTGAACAGGCTGTACTC | no | EcoRI
42 | 236717b28_2 | 665725 | T/C | GATAGGAAAGATTTTTGGTCT | CCGTTCTTAGTTTTGGGTC | yes | BsmAI
43 | 2361170b26_2 | 764495 | A/G | CTGGGCTACCTGTGGATCACT | CTGTTTACAGTAGACATTCC | yes | BsaBI
44 | 2361170b26_1 | 764858 | C/G | CTAGGCAGCAGGGCACTC | GGCACCTCACGTGGATG | no | ApaI
45 | 626789_3 | 128859 | | TCACACCTGCAATGAAATC | GTTGACTCTCTTCTTGCC | |
46 | 154016_9R | 346769 | | ATTCTACTTACAGCCCATG | GTCCAATTTAAAATTTAGGCG | |
47 | 5133822_1 | 27143 | | TAAGTGCAGCGAACCCAG | CACTTGGCTGCCTGGAC | |
48 | 626789_4 | 109423 | | ACTTACATCATCACCAGAG | ATTTCTCCATGATTAAAATCAG | |
49 | 626789_1 | 123421 | | ACATCTTTTTAGACACATAAC | TTGGCTAGCAGGCAATTG | |
50 | 626789_2 | 129231 | | TCCTCACCATTAGTCAAAG | TTAATGGTATGATAGCAACC | |
51 | 1895799_2 | 97969 | | GTTTTGGTGGAGGAATCGG | GTTTCTCTCCCTGCTTTC | |
52 | 1895799_3 | 97972 | | GTTTGGTGGAGGAATCGG | GTTTCTCTCCCTGCTTC | |
53 | 154016_1S | 313625 | | GACTAGACTGGAATCTGG | CGGTACACACCGGTGG | |
54 | 154016_4S | 335681 | | ATTCCCAAATGAGCAGCC | CATAACCCCTGTGCATCC | |
55 | 154016_4R | 335560 | | ATTCCCAAATGAGCAGCC | CATAACCCCTGTGCATCC | |
56 | 154016_4R2 | 335668 | | ATTCCCAAATGAGCAGCC | CATAACCCCTGTGCATCC | |
57 | 154016_4ins | 335679 | | ATTCCCAAATGAGCAGCC | CATAACCCCTGTGCATCC | |
58 | 154016_5Y | 338728 | | GACATCTAACTGAGTCTCC | TAGTCAGTCATATCACATAG | |
59 | 154016_1A.1Y | 314833 | | CACAAAGGCACCAAACCAC1 | ACCCTTACAGTCCATATGTA | |
60 | 154016_1A.2Y | 314849 | | CACAAAGGCACCAAACCAC | ACCCTTACAGTCCATATGTA | |

TABLE 4C SNP Exonic- Gene SNP (IUB amino acid name SNP name alias 5′ Flanking Sequence code) 3′ Flanking Sequence change CLLD8 CLLD8 X4 G28A CLD802 TCTAATGCAGATAGTCCTATTAACCTC R ATGCTGTTTTTCTGTCTTTAACAGAATAC TCAGAGGAATAGTGAAGATTAAATGAG ATCCAAGCAATGATTCTAGTGAATGAAGC ATAAAATATGTAAATTGTTTAGCACAG AACTATAATTAACAGTTCAACATCAATAA TAGCTGGCACCATAAATAAAGGTAATG AGGGTATGTACATCTCTATTCC GT CLLD8 CLLD8 X8 G91A CLD803 CTATTTAGGGAAAAAGCTGATAGTGTT R ACTTGCTGCTTTTTCATTCTTTCTCCATC CATGAGCCAAACTATTCTAAATGCTAT CATTGCCTGCCTTTTAGAACAAAATGTGC TACATCTTTTATTACATCATTTACACA ATGCTCTTCAACTGACAGCAAGGAATGCC TTTATCAAGACAGTCTCTTGACTT AAAACTTCCCCCTTGTCAAGTGACAAAAT CLLD8 CLLD8 X14 G40A CLD807 TTATAGGGATTTGTCCCATTTTTGCAT R ACAAAAAAAGAACAAAATCAGTGTGATAT TTCTGCCAGCAACATAGAAGAGCGTCA ATTAAAAACAGAATGGTTTTCTAATGCAA GAACAGGGCTTTTCTGAATGGGTTCCC AAGCACAAGGCATCAGACTTTATGCTATT ACTATTAACAGGAGCTGTGATAATTTA TCTAACCTTTGTGTTTATTTTTATTTAAC TACATT AGCATA CLLD8 CLLD8 X8 G129T ATGAGCCAAACTATTCTAAATGCTATT K CCTTTTAGAACAAAATGTGCATGTCTTCA ACATCTTTTATTACATCATTTACACAT ACTGACAGCAAGGAATGCCAAAACTTCCC TTATCAAGACAGTCTCTTGACTTGACT CCTTGTCAAGTGACAAAATAACCACTGGA TGCTGCTTTTTCATTCTTTCTCCATCC TATAAATATAAAAGACTACAGAGACA ATTGCCT CLLD8 CLLD8 X10 G321A AAAGAGGAAATTAGAAGTTGCATGTTC R TGTAAGTAACAGCTGAGGAACCCAGAGTA AGATTGTGAAGTTGAAGTTCTCCCATT AACTAAATTATTATCAATCAATTGGTTCT AGGATTGGAAACACATCCTAGAACTGC TTTTCATTCCTTCCCCTCTTTCTTTTCTC TAAAACTGAGAAATGTCCACCAAAGTT CTCATTTGTATTTATCATTTTGCTTCAAA CAGTAATAATCCCAAGGAGCTTACT CLLD8 CLLD8 X1 A384G AACGCCGCGGAGCCCTCGGCCGCCCGA R TTTCAGCCTGTCTGTCTCAGGGTGCAGCC GCAGGGGCTGGACCCCAGCCCTTGCAG TTAATGAGAGGTGATTCCTAAGCTGCTGG CCTCCCTTCTCCTGGCACCCAAGTGCA GAACCTGAGGTTGTCAAAGGGGCGGCAGG GTCCTGGCTGCAGAAGGGGCCGCGGGC AAATGGACAGCAGTATAAAACCCAGAAGC GCACTGAGTTTCCAACCTCC AGAAC CLLD8 CLLD8 X11 A180G GTAATAATGGATATTATCTGTACTTGA R TATCATTCAGTTATTAGAGATCCTGAATC GTAAAGAGAATGAAGAGAATTTTTAGT CAAGACAGCCATTTTTCAACACAATGGGA GATACAGAATAACTTTTTTTTCCTTTT AAAAAATGGTAAAAAATGCAAAATGTAGT TAGGGAAACGAAATATGATAATATTTC TGGGACCCTTCTTTTCTTTTTTAA AAGAATTCA 
ANGE1 ANGE1X2rs2031532 ANGE01 TCTCACCAGGGATATTACTTATAGGTG R TACTTTGCACAATCAGAGAATATAGCTGC TCTTTCAGGTTGCAGAAAAGATGGAAA TCATGAGAATTGTTTGGTAAGTTACTTGA AAAGGACATGTGCACTCTGCCCCAAAG AAACATACTTCAAAGTACATGAGTACTTT ATGTCGAATATAATGTCCT TAGTGTAGGACAC ANGE1 ANGE1X3rs2247119 ANGE02 ATGGTCTGCTCATCCACCCCCAAAGTT Y AGCTTTCAATATTACTTTGTGCATACT GCCACTGTTTTACCCCCAGTCCATTTC CTGGATTTTTTTTTCCAATCTGCAGCT CCACTCCTGGTTTGATTTTCCACTTTT GTATTCTTCAGGACTTGTGGAATGTGA GGTATGATTTTCTGACATG GGATCAGGATCCACTTAAT ANGE1 ANGE1X4 4 base ANGE08 GATAAAATGCAGGGCTTCTGGAACTTA (ACTC) AGTAAAATCTATTCTTCTGTAGAAATGCA delection from AATAACATTTAATGATATCTTTGAATT AATTTTGTCATAAAAGAGGAGCCACCGTG position 82-85 TTAACAAATAACAACAAACAGGTACT GGATGTGATTTAAAAAACTGTAACAAGAA TTACCACTTTTTCTGTGCCAAGAAGGACG ACGCAGTTCCACAGTCTGATGGAGTTCGA GGAATTTATAAGTATTTAATAAAACATTT TTAAAACCACATTTGGGGGATTGGGATAA GG ANGE1 ANGE1rs942870 ANGE11 AAGAACTTATCCATGTAACCTGAAACC Y AGTTATTGTAAGCAGAGAAATTTTCATTT ACCTGTTCCCCAAAAGGTATTGAAATT TTACTTTTTGTAACATTTTATTCATTGTT TAAAAAGTTACAGTTTTATATGTTTAA TCAGCTGGCCAAAAAAAATGAGGTATGAT AAAAGCCTCTTAATTCATA GAAATTATAATTA ANGE1 ANGE1X3C148T CTGACATGTAGCTTTCAATATTACTTT Y CACTTAATCCTGATAGAAGTTTTGATGTG P46S GTGCATACTCTGGATTTTTTTTTCCAA GAATCAGTAAAGAAAGAAATCCAGAGAGG TCTGCAGCTGTATTCTTCAGGACTTGT AAGGAAGTTGGTAAGTGTAAATGATGTTA GGAATGTGAGGATCAGGAT TTTCTTATACTGG ANGE1 ANGE1rs2147985 AACTGTATTGGAAACCTCAATTCAACT R CGGTCTCAGCTCACTGCAACCTCCGCCTC TAACATTTAATTTTTTTTTTTTTTTTT CTGGGTTCAAGCGATTCTCCTGCCTCAGC TTGAGACGGAGTTTTGCTCTTGTTGCC CTCCCGAGTAGCTAGGATTACAGGCACGG CAGGCTAGAGTGCAGTGAC GCCACCATGCTCG ANGE1 ANGE1X4C46G CAAATATAAGTACTGTACCCTAATGTT S ATGGAAAAAAGGACATGTGCACTCTGCCC TGTAACGGGATGGAATCTTGCTGATGA CAAAGATGTCGAATATAATGTCCTGTACT TGCCCAGGATAAAATGCAGGGCTTCTG TTGCACAATCAGAGAATATAGCTGCTCAT GAACTTAAATAACATTTAATGATAT GAGAATTGTTTGCTGTATTCTTCAGGACT TGTGGAATGTGAGGATCAGGAT ANGE1 ANGE1X4C76T TACCCTAATGTTTGTAACGGGATGGAA Y ATGGAAAAAAGGACATGTGCACTCTGCCC TCTTGCTGATGATGCCCAGGATAAAAT CAAAGATGTCGAATATAATGTCCTGTACT GCAGGGCTTCTGGAACTTAAATAACAT 
TTGCACAATCAGAGAATATAGCTGCTCAT TTAATGATATCTTTGAATTTTAACAAA GAGAATTGTTTGCTGTATTCTTCAGGACT TAACAACAAA TGTGGAATGTGAGGATCAGGAT ANGE1 ANGE1X4C81T GATAAAATGCAGGGCTTCTGGAACTTA Y ACTCAGTAAAATCTATTCTTCTGTAGAAA AATAACATTTAATGATATCTTTGAATT TGCAAATTTTGTCATAAAAGAGGAGCCAC TTAACAAATAACAACAAACAGGTACT CGTGGGATGTGATTTAAAAAACTGTAACA AGAATTACCACTTTTTCTGTGCCAAGAAG GACGACGCAGTTCCACAGTCTGATGGAGT TCGAGGAATTTATAAGTATTTAATAAAAC ATTTTTAAAACCACATTTGGGGGATTGGG ATAAGG ANGE1 ANGE1rs2274276 ATTGAGTGCATTCCCATATCTTTTCAC S CATGTGCCAATTTTTCTACTCAATTATTT CAATTATATTTGTTTTCCTATGACCCA ACCTGTTTTGCATTAAACTTATAATATCT ATTTGTTCATTTTTCTATTCAATGAAC TTTTTAAAAATTAACCCTTTATCATAAGT CCTCTCCCCAGAGAGTTCC GCTGCAAACACTT ANGE1 ANGE1X6rs2274277 CCTTTATCATAAGTGCTGCAAACACTT S AGTAATCTTTTCCTTTATGGATTCCAATT AGTTGAAGTTTGCCATATCTTTTGACT TTTAAATGGTTTATATTTTTAGCTAAATT TTGTAAAAACTTTTGGCATATGAGTTG TTCAGGAGTGAAAAGAAAAAGAGGAAGGA TATATTTCATGTAGTCAAA AGAAACCCCTCTC ANGE1 ANGE1X9 2 base GTTCCATGGCACATGACTGCTCTCTTT (TT) TTGAGAAATATTTATCACGATATTGAAAC deletion from CCTTTCCCCTGTTTTGGATTACATATA AATATACTGTACAGGTGATAAAAATATTT position 223 TAAATGGAGGAGTTTCTGCCTATACAA TAGAAGAATGTTCATTGTTTTCTTAAATG ACTGTTTAATATTGAAAATGTTTCTCT AGAAAAGCCAATTACAAAAACAGTATGAC CCCTCCAGACTATGAAGAAATCGGGAG CCCGCCCACCTGCCCACACACACAGGAAA TGCACTTTTTGACTGTAGATTGTTCGA AAAATATCGAGGTACATGTGCACAGACAA AGACACATTTGTAAATTTTCAAGCAGG AAGCTGCACCTGGTGGAGACAGTAGTTGG TATATGAGTTATATAACATCTGAGCAG TTGTAGGTAGTTGGAATATAGTGATTTTT CATAGT GTCTGTTCTTTTACTGCTATATTTTCCAG ATTTTCTATATACACACAAATACTTTTAT AATGAGAAAAACTCATTAAAAAAGTAGAG ACAACAAAAAATTGATTCAAAAATTGGAG CATATTTTGGCCTGTGTGTGGCCCAGGCA GGAGCTGGTAAAGCTTC ANGE1 ANGE1X9 4 base GTGCACTTTTTGACTGTAGATTGTTCG (TTAT) CACGATATTGAAACAATATACTGTACAGG deletion AAGACACATTTGTAAATTTTCAAGCAG TGATAAAAATATTTTAGAAGAATGTTCAT starting GTATATGAGTTATATAACATCTGAGCA TGTTTTCTTAAATGAGAAAAGCCAATTAC from position GCATAGTTTTGAGAAATAT AAAAACAG 234 ANGE1 ANGE1X9C241T GTTCCATGGCACATGACTGCTCTCTTT Y GATATTGAAACAATATACTGTACAGGTGA CCTTTCCCCTGTTTTGGATTACATATA 
TAAAAATATTTTAGAAGAATGTTCATTGT TAAATGGAGGAGTTTCTGCCTATACAA TTTCTTAAATGAGAAAAGCCAATTACAAA ACTGTTTAATATTGAAAATGTTTCTCT AACAGTATGACCCCGCCCACCTGCCCACA CCCTCCAGACTATGAAGAAATCGGGAG CACACAGGAAAAAAATATCGAGGTACATG TGCACTTTTTGACTGTAGATTGTTCGA TGCACAGACAAAAGCTGCACCTGGTGGAG AGACACATTTGTAAATTTTCAAGCAGG ACAGTAGTTGGTTGTAGGTAGTTGGAATA TATATGAGTTATATAACATCTGAGCAG TAGTGATTTTTGTCTGTTCTTTTACTGCT CATAGTTTTGAGAAATATTTATCA ATATTTTCCAGATTTTCTATATACACACA AATACTTTTATAATGAGAAAAACTCATTA AAAAAGTAGAGACAACAAAAAATTGATTC AAAAATTGGAGCATATTTTGGCCTGTGTG TGGCCCAGGCAGGAGCTGGTAAAGCTTC ANGE1 ANGE1G261A GTTCCATGGCACATGACTGCTCTCTTT R TACAGGTGATAAAAATATTTTAGAAGAAT CCTTTCCCCTGTTTTGGATTACATATA GTTCATTGTTTTCTTAAATGAGAAAAGCC TAAATGGAGGAGTTTCTGCCTATACAA AATTACAAAAACAGTATGACCCCGCCCAC ACTGTTTAATATTGAAAATGTTTCTCT CTGCCCACACACACAGGAAAAAAATATCG CCCTCCAGACTATGAAGAAATCGGGAG AGGTACATGTGCACAGACAAAAGCTGCAC TGCACTTTTTGACTGTAGATTGTTCGA CTGGTGGAGACAGTAGTTGGTTGTAGGTA AGACACATTTGTAAATTTTCAAGCAGG GTTGGAATATAGTGATTTTTGTCTGTTCT TATATGAGTTATATAACATCTGAGCAG TTTACTGCTATATTTTCCAGATTTTCTAT CATAGTTTTGAGAAATATTTATCACGA ATACACACAAATACTTTTATAATGAGAAA TATTGAAACAATATACT AACTCATTAAAAAAGTAGAGACAACAAAA AATTGATTCAAAAATTGGAGCATATTTTG GCCTGTGTGTGGCCCAGGCAGGAGCTGGT AAAGCTTC ANGE1 ANGE1X9C341T GTTCCATGGCACATGACTGCTCTCTTT Y CGCCCACCTGCCCACACACACAGGAAAAA CCTTTCCCCTGTTTTGGATTACATATA AATATCGAGGTACATGTGCACAGACAAAA TAAATGGAGGAGTTTCTGCCTATACAA GCTGCACCTGGTGGAGACAGTAGTTGGTT ACTGTTTAATATTGAAAATGTTTCTCT GTAGGTAGTTGGAATATAGTGATTTTTGT CCCTCCAGACTATGAAGAAATCGGGAG CTGTTCTTTTACTGCTATATTTTCCAGAT TGCACTTTTTGACTGTAGATTGTTCGA TTTCTATATACACACAAATACTTTTATAA AGACACATTTGTAAATTTTCAAGCAGG TGAGAAAAACTCATTAAAAAAGTAGAGAC TATATGAGTTATATAACATCTGAGCAG AACAAAAAATTGATTCAAAAATTGGAGCA CATAGTTTTGAGAAATATTTATCACGA TATTTTGGCCTGTGTGTGGCCCAGGCAGG TATTGAAACAATATACTGTACAGGTGA AGCTGGTAAAGCTTC TAAAAATATTTTAGAAGAATGTTCATT GTTTTCTTAAATGAGAAAAGCCAATTA CAAAAACAGTATGACC ANGE1 ANGE1X9C344T GTTCCATGGCACATGACTGCTCTCTTT Y CCACCTGCCCACACACACAGGAAAAAAAT CCTTTCCCCTGTTTTGGATTACATATA 
ATCGAGGTACATGTGCACAGACAAAAGCT TAAATGGAGGAGTTTCTGCCTATACAA GCACCTGGTGGAGACAGTAGTTGGTTGTA ACTGTTTAATATTGAAAATGTTTCTCT GGTAGTTGGAATATAGTGATTTTTGTCTG CCCTCCAGACTATGAAGAAATCGGGAG TTCTTTTACTGCTATATTTTCCAGATTTT TGCACTTTTTGACTGTAGATTGTTCGA CTATATACACACAAATACTTTTATAATGA AGACACATTTGTAAATTTTCAAGCAGG GAAAAACTCATTAAAAAAGTAGAGACAAC TATATGAGTTATATAACATCTGAGCAG AAAAAATTGATTCAAAAATTGGAGCATAT CATAGTTTTGAGAAATATTTATCACGA TTTGGCCTGTGTGTGGCCCAGGCAGGAGC TATTGAAACAATATACTGTACAGGTGA TGGTAAAGCTTC TAAAAATATTTTAGAAGAATGTTCATT GTTTTCTTAAATGAGAAAAGCCAATTA CAAAAACAGTATGACCCCG ANGE1 ANGE1X9T350C GTTCCATGGCACATGACTGCTCTCTTT Y GCCCACACACACAGGAAAAAAATATCGAG CCTTTCCCCTGTTTTGGATTACATATA GTACATGTGCACAGACAAAAGCTGCACCT TAAATGGAGGAGTTTCTGCCTATACAA GGTGGAGACAGTAGTTGGTTGTAGGTAGT ACTGTTTAATATTGAAAATGTTTCTCT TGGAATATAGTGATTTTTGTCTGTTCTTT CCCTCCAGACTATGAAGAAATCGGGAG TACTGCTATATTTTCCAGATTTTCTATAT TGCACTTTTTGACTGTAGATTGTTCGA ACACACAAATACTTTTATAATGAGAAAAA AGACACATTTGTAAATTTTCAAGCAGG CTCATTAAAAAAGTAGAGACAACAAAAAA TATATGAGTTATATAACATCTGAGCAG TTGATTCAAAAATTGGAGCATATTTTGGC CATAGTTTTGAGAAATATTTATCACGA CTGTGTGTGGCCCAGGCAGGAGCTGGTAA TATTGAAACAATATACTGTACAGGTGA AGCTTC TAAAAATATTTTAGAAGAATGTTCATT GTTTTCTTAAATGAGAAAAGCCAATTA CAAAAACAGTATGACCCCGCCCACC ANGE1 ANGE1X9G351C GTTCCATGGCACATGACTGCTCTCTTT S CCCACACACACAGGAAAAAAATATCGAGG CCTTTCCCCTGTTTTGGATTACATATA TACATGTGCACAGACAAAAGCTGCACCTG TAAATGGAGGAGTTTCTGCCTATACAA GTGGAGACAGTAGTTGGTTGTAGGTAGTT ACTGTTTAATATTGAAAATGTTTCTCT GGAATATAGTGATTTTTGTCTGTTCTTTT CCCTCCAGACTATGAAGAAATCGGGAG ACTGCTATATTTTCCAGATTTTCTATATA TGCACTTTTTGACTGTAGATTGTTCGA CACACAAATACTTTTATAATGAGAAAAAC AGACACATTTGTAAATTTTCAAGCAGG TCATTAAAAAAGTAGAGACAACAAAAAAT TATATGAGTTATATAACATCTGAGCAG TGATTCAAAAATTGGAGCATATTTTGGCC CATAGTTTTGAGAAATATTTATCACGA TGTGTGTGGCCCAGGCAGGAGCTGGTAAA TATTGAAACAATATACTGTACAGGTGA GCTTC TAAAAATATTTTAGAAGAATGTTCATT GTTTTCTTAAATGAGAAAAGCCAATTA CAAAAACAGTATGACCCCGCCCACCT ANGE1 ANGE1X9C376T GTTCCATGGCACATGACTGCTCTCTTT Y GAGGTACATGTGCACAGACAAAAGCTGCA CCTTTCCCCTGTTTTGGATTACATATA 
CCTGGTGGAGACAGTAGTTGGTTGTAGGT TAAATGGAGGAGTTTCTGCCTATACAA AGTTGGAATATAGTGATTTTTGTCTGTTC ACTGTTTAATATTGAAAATGTTTCTCT TTTTTACTGCTATATTTTCCAGATTTTCT CCCTCCAGACTATGAAGAAATCGGGAG ATATACACACAAATACTTTTATAATGAGA TGCACTTTTTGACTGTAGATTGTTCGA AAAACTCATTAAAAAAGTAGAGACAACAA AGACACATTTGTAAATTTTCAAGCAGG AAAATTGATTCAAAAATTGGAGCATATTT TATATGAGTTATATAACATCTGAGCAG TGGCCTGTGTGTGGCCCAGGCAGGAGCTG CATAGTTTTGAGAAATATTTATCACGA GTAAAGCTTC TATTGAAACAATATACTGTACAGGTGA TAAAAATATTTTAGAAGAATGTTCATT GTTTTCTTAAATGAGAAAAGCCAATTA CAAAAACAGTATGACCCCGCCCACCTG CCCACACACACAGGAAAAAAATAT ANGE1 ANGE1X9A492G GTTCCATGGCACATGACTGCTCTCTTT R TATACACACAAATACTTTTATAATGAGAA CCTTTCCCCTGTTTTGGATTACATATA AAACTCATTAAAAAAGTAGAGACAACAAA TAAATGGAGGAGTTTCTGCCTATACAA AAATTGATTCAAAAATTGGAGCATATTTT ACTGTTTAATATTGAAAATGTTTCTCT GGCCTGTGTGTGGCCCAGGCAGGAGCTGG CCCTCCAGACTATGAAGAAATCGGGAG TAAAGCTTC TGCACTTTTTGACTGTAGATTGTTCGA AGACACATTTGTAAATTTTCAAGCAGG TATATGAGTTATATAACATCTGAGCAG CATAGTTTTGAGAAATATTTATCACGA TATTGAAACAATATACTGTACAGGTGA TAAAAATATTTTAGAAGAATGTTCATT GTTTTCTTAAATGAGAAAAGCCAATTA CAAAAACAGTATGACCCCGCCCACCTG CCCACACACACAGGAAAAAAATATCGA GGTACATGTGCACAGACAAAAGCTGCA CCTGGTGGAGACAGTAGTTGGTTGTAG GTAGTTGGAATATAGTGATTTTTGTCT GTTCTTTTACTGCTATATTTTCCAGAT TTTCT ANGE1 ANGE1X10G247A GAGGCACGAGGATTGTGTATCTGTACC R TTTCCTAAGCCAAGAGTCATGTCAAATTG TGCAAGTGAAAGGGGCTAGAAATGGTA CAATCAGGCTCAAAACCAGAGACCAGGCT CATTATTTCTAATAAAACTACTGAATA GTGAAATCCACACATCTTTAGAACTAGTC AGCTGAAATAATTTTATTTCTTCTTTT GTCTCCTCTTGGCCTCAGCAGCTCTTCCC CATAGCAATAGAGAAAAAAATTCATGC TGTTCTTACTGGTTGACATTTTGATCACT ATCTCAACAAAGGTGGCAGCAGTTGAA CTTTGCACACTCTTGTGTTTTTTGCTCAC GGAAGAGATTGAGCTACTTCAGGACTT TGTCACATTCCCAGCACCTAGTATGCTCA AAAACAAACCTTGTGCTCTTTTCAAGA GTAAATGTTTGTGGAATAAGTGCATAAAA AAATAGAGATCTTATGTCAAGTTCTAC TGTTCTTAACCTTTGATTCTACTTACAGC ATCAATATCATCCCTGTCTTATTAGGG CCATGATAGCCTCTTAGATATAATAAATT ATTACC TGGATTATACTACTTTACTTGTACCAAAT TTGCCTGTTTTCGTGTCACAAAGTCTGCT TTTGAAAAGTCTCTTTTCAGCCACAGTTA TCACGTGGA CLLD7 CLLD7 PROM 1 CLD01 AGTGGGCAGGGAGCCAAGTTTCTCGGT R 
GGAGCCCGTGCCTCGCGCGTTCCTGGTTC A351G CGCTTTTCCGTCCTAGGTCTCTGGGGT CTCAGACACAAAAGCCTCTAAGTCCCGGC GGTAGGCCGACCCTCCCCACAGCCAAG AGCAGCCACCGGATTTCATGGGGACACTC CCATCTCGGGGAGCAGAGC CAGTGGCAGGGCC CLLD7 CLLD7 X3 C247T CLD703 TATTTATAGGAATTGCTTGAAGCCAGA Y GTGTGTCTTCGGCACCTCAGCCAGTGAAG V24A GTCATGGTGGATGTCGGAAAGTGGCCC CACTGTACGTTACTGACAATGATGAGGTA ATCTTCACTCTACTCTCCCCTCAAGAG AGAGTCTCTTGAAACCTAGTTTCGTTTTA ATCGCGTCTATTCGGAAGG GGACTTATTTGGC CLLD7 CLLD7 RS2274283 CLD706 CATTGCCTGTGGTCAGACTTCATCCAT R TCAGTTTCTTCACAAGGTGAATGTATCAG GGCTGTTCTGGACAATGGCGAGGTGAG ATGGAAGAAGGAAGAGTGCTTTGTTTTGG GTGTCTCCACTCCCATTTCCCTTTCTA TAGCTCCATTGAAAGTTAACTGATTACTG CCTTCTTTTTCAATTACAC ACTGAGGAGCTGG CLLD7 CLLD7 RS1536195 CLD712 ATATATCCCCTCATATTCTTAGTCCTG R TAGTATTGTGTAGTTATTATGGTTAATAG AAATGTAAAAAGCTTCAGGTGTATTGG TTTTGGACCTAGTTTATGACCATAGGCTT CTTTAATTTGGTGCTTAAAAAAGATTT TTTTGTTCTAATGTTACTGAAGGACTTTT GAAGTAAGTATCTTTAAAA TTTTTTTTAATTT CLLD7 CLLD7 RS2296502 CLD713 TATCAAGAGAGGAATTACTGTGGAGAA W CCTTTAACTCCCTTTGTGCTCGTCTCCCC TGCCTTTTCGCTATTCTCTGCTGCAGT CGCTTCTTGCGTGAGGCATGGAAAACGAT CAGATATGATGCAGAGGTAACTTAAAC GAGTCACTCCACTTCTGTTTAGGGAATCA AACAAACAGAAACCAGACC GTGAACCCATTGA CLLD7 CLLD7 X1A295T GCTGACCTCGCAGGTAGCGTGTGGGCG W GTGCCCCGTTACTCCTGTACCTCCCGCCT CGGGGTCGAGCTCGCGGAGGCCTCTTC CCCTCGCTCGGGCTCCCGGGCAGTCCCTG CCCCTCGGCCCTGCCCCTGCTTCCCTT CGGGCGCTGCGTCCGGGGCGAGGGATGGG CCCCGCAGGCCGGGCCGGG GTCCCCGGCCCGG CLLD7 CLLD7 X5 T206C GCTGATTTTTTGTGTGCTTTGTGTGCA Y GCTCCCGTCCAGGTCTGTACCAATCTCTT CAGATGGAGTGGTTTATGCCTGGGGCC GATCAAGCAAGTGGTGGAAGTAGCTTGTG ACAATGGATATAGCCAGCTTGGGAATG GCTCACATCATTCAATGGCTCTGGCAGCT GGACGACCAACCAAGGCAT GATGGAGAGGTAA CLLD7 CLLD7 X6 TTAAGAGGGTAGTTGGCATTGCCTGTG Y CTTTTTCAATTACACGTCAGTTTCTTCAC RS2274284 GTCAGACTTCATCCATGGCTGTTCTGG AAGGTGAATGTATCAGATGGAAGAAGGAA ACAATGGCGAGGTGAGGTGTCTCCACT GAGTGCTTTGTTTTGGTAGCTCCATTGAA CCCATTTCCCTTTCTACCT AGTTAACTGATTA CLLD7 CLLD7 RS 2274281 TCTATTTCGCCTGTTAAACTGTGACCA R TGTATTAAACAGGAAAGAGTGACTTGCCC AAAACTAGTACCTTTCAGAGTGTTCTG CAGACTATCTAACTAGTAAGTCATGGAGT 
ACATACAGTAGGTACTTAATAATTGTT AGAAACTGATGCCAATTCTGTCATCCCCT AGTGAGGCTCAACTTCCTC TCTCCAAAATAGC CLLD7 CLLD7 X7 TGTCTCTTCCTTCTTACCAGCAGTTTA Y TGGGAAACAATGGCAACCAGCTGACCCCT RS2274278 AAAATCTCACTCTCTCCCTTCTGTCCT GTGAGAGTGGCAGCTTTGCACAGCGTGTG CTCTAAGGTATATGGCTGGGGTTACAA TGTGAACCAGGTACGTGTGGTGCACTCTC TGGCAACGGTCAGCTGGGC AGTTAGTGGCTTCC CLLD7 CLLD7 X9 C219A TCAGGGTGGTAGAGATTGCAGCCTGTC M GTGATCCTCCCGCACCTCACCCACTTCTC ACTCTGCCCACACGTCTGCAGCCAAGA CTGCACCGACGACGTGTTTGCCTGCTTTG CGCAGGGTGGGCACGTGTACATGTGGG CCACTCCCGCCGTCTCGTGGCGCCTCCTG GCCAGTGCCGGGGTCAGTC TCTGTGGGTAAGA CLLD7 CLLD7 X9 C258T CGTCTGCAGCCAAGACGCAGGGTGGGC Y GACGTGTTTGCCTGCTTTGCCACTCCCGC ACGTGTACATGTGGGGCCAGTGCCGGG CGTCTCGTGGCGCCTCCTGTCTGTGGGTA GTCAGTCCGTGATCCTCCCGCACCTCA AGAAAGTGCAGGGCCACCTCCACCCAGGA CCCACTTCTCCTGCACCGA AGAATGGTACTAC CLLD7 CLLD7 X9 C285G ACGTGTACATGTGGGGCCAGTGCCGGG S GCCGTCTCGTGGCGCCTCCTGTCTGTGGG GTCAGTCCGTGATCCTCCCGCACCTCA TAAGAAAGTGCAGGGCCACCTCCACCCAG CCCACTTCTCCTGCACCGACGACGTGT GAAGAATGGTACTACCAACTGACCAGTTT TTGCCTGCTTTGCCACTCC TCCTGTGTCTTTG CLLD7 CLLD7 X13A220G AAGTTACACAGACTGCAGCATTTTGGC R AGGCTGCTGGGTTCTGTGTGAGTGCTCTG AAATGGATGGCCCTCTGCTAAAGGAAT GGGCACTGTTGAGGATGTGTCCAGTTTGT TCATTGCTAAAGCCAGTAAATGTGGAG GCTCTACGGGTGATGTGATTCTGCAGGTA CCTTTAAGAACTGAAGCGC AAAGACCATCAGG CLLD7 CLLD7 X13 C499G TAAACAACCCTTGTTTCCAGATAAGTG S TTTTCATTTTTGATTAGAGGAACTTCTGA TATTTTAAATGTGACCTTTCGTAAATT ACTTGGAAAGGGGAAGTTGGCAGCAACTC TGGGCTGGAACATGTAAAAGGGTGAAA TCCTGGGCCACGTATCAACAATCTTCTGA GTTAGTCGTTTTTGGTCTT CAGCAGGAGCAGT CLLD7 CLLD7 X13 C568T AAAAGGGTGAAAGTTAGTCGTTTTTGG Y GTATCAACAATCTTCTGACAGCAGGAGCA TCTTCTTTTCATTTTTGATTAGAGGAA GTGATAGTTGGGGAATTGTGAGATGAATG CTTCTGAACTTGGAAAGGGGAAGTTGG GAGAGGCCCCATCGTGAATTAAAGATGCT CAGCAACTCTCCTGGGCCA GACCTGGGGATTG CLLD7 CLLD7 X13.2 CATTAAGGTAGAACTTGAGAACTCAAG S TTGCCTTCCTAAATTATGACTGCTGAATC RS1062979 TTTAAATTGTCCCCCACCACCTTCTTC ATCATATCTACTATATATGACTGGAAGAA TATAAATGCAAACTTAAGAGGAAAATA GTGGTTCTTCAAGTCAACCCTCTTACAAG ATGACATGTATAGTATACT ACTTAGT:GGAAT CLLD7 CLLD7 X13.2 TCTATAAATGCAAACTTAAGAGGAAAA 
R GAAGAAGTGGTTCTTCAAGTCAACCCTCT RS1046027 TAATGACATGTATAGTATACTCTTGCC TACAAGACTTAGT:GGAATTTGCTTTATC TTCCTAAATTATGACTGCTGAATCATC TACTTTAGGCCAAATCCATCACACATTGG ATATCTACTATATATGACT CTTATGTGAAACT CLLD7 CLLD7 X13.2 ATGTATAGTATACTCTTGCCTTCCTAA R GACTTAGT:GGAATTTGCTTTATCTACTT RS1046028 ATTATGACTGCTGAATCATCATATCTA TAGGCCAAATCCATCACACATTGGCTTAT CTATATATGACTGGAAGAAGTGGTTCT GTGAAACTTTATCATCTTTCACTTTTGGT TCAAGTCAACCCTCTTACA TTTCCTCTGTTTT CLLD7 15 BASE DELETION CTGGAAGAAGTGGTTCTTCAAGTCAAC (CTTTA CTTTATCATCTTTCACTTTTGGTTTTCCT AT POS 684 CCTCTTACAAGACTTAGT:GGAATTTG TCATCT CTGTTTTTAAACATCTTAGGTATAACAGC CTTTATCTACTTTAGGCCAAATCCATC TTCA) ACAATTTCACCTTGAAAAAGCACCAAAAT ACACATTGGCTTATGTGAAA TATGGTGTTCCTG CLLD7 CLLD7 X13.3 GAACCTGGAGCCTGAATACATTTTCAG K GGTGGGAAGGGGAGTGAGGCCCTAAAGAT RS1046034 AGCCAAATTACAAGTGGGTGAAGACAT AGGAATTCACTGATAGCTGAAATAGATAC GCTACAACTACTATTTTTAGCAATGTT AAGCCTAGGAGCCCCAGCCCCCTTTTCTT TTTAAATTTGTGTCTATTG GACCATATCACAA CLLD7 CLLD7 RS942870 TAATTATAATTTCATCATACCTCATTT R TATGAATTAAGAGGCTTTTTTAAACATAT TTTTTGGCCAGCTGAAACAATGAATAA AAAACTGTAACTTTTTAAATTTCAATACC AATGTTACAAAAAGTAAAAATGAAAAT TTTTGGGGAACAGGTGGTTTCAGGTTACA TTCTCTGCTTACAATAACT TGGATAAGTTCTT CLLD7 CLLD7 X13.3 TTCAACTGTGAGTTTCATCATGTGTGT Y TCCCTTAGTGAATTTTGTCTCCTGCTTGG T304C CCGTTGAGCTTGCTTTTAAGAGCAATG AAAGTCTCTCTTCTGAACCTGGAGCCTGA TTTTGCTTTGCTCTCTGGTCCATGAAC ATACATTTTCAGAGCCAAATTACAAGTGG TCATTTCCATTTGAGTGAG GTGAAGACATGCT CLLD7 CLLD7 X13.4 GTTAGAATTCGTTCTACAACATGATAG Y TATAGGCTTATGATCTGAGCAAAATGTGA C1201 AACTTTTCTTACTAGCTTCAGAAATGC ACTTCAGTATGTTTACTATTGCTCTTACT ACATTGATTTGTGCTATGATGGGTGGT TGAAAACTTTTTTTCAAAAAAAGCACAAA GTTTGTAACAATCACTGTT TTAAAGTAGTAAA CLLD7 1 BASE DELETION TCTTACTAGCTTCAGAAATGCACATTG (T) CAGTATGTTTACTATTGCTCTTACTTGAA AT POS 153 ATTTGTGCTATGATGGGTGGTGTTTGT AACTTTTTTTCAAAAAAAGCACAAATTAA AACAATCACTGTTCTATAGGCTTATGA AGTAGTAAATTCATATCCATAGATAGTTC TCTGAGCAAAATGTGAACT ATTCATTCAACAA CLLD7 6 BASE INSERTION AATATAAGTGAAGGACCACTTCTCATA (TCTTT TTTTTTTAAATACACTCTCTTATTTTTCT STARTING AT 
TTAGATTACTAAGTCATTTGTATGAAT A) ACTTGTTTTTTTGTTAATCATAGCAGGAT POS 371 ATGTGTGGCAGTGAAGAGAACAGGTCT ATGACAACTCTTATTTGAATTGATTTTTT TTCAAAAAGCATTTGATTA CATCTAATGTAAT
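Table 4C gives each single-base SNP as an IUB ambiguity code placed between its 5′ and 3′ flanking sequences. The following minimal sketch shows how such an entry may be expanded into its two allele sequences. The two-base code assignments are the standard IUB/IUPAC ones; the function name `expand_snp` and the short flanks in the usage lines are hypothetical and are not taken from the table.

```python
# Standard two-base IUB/IUPAC ambiguity codes, covering the codes used
# for single-base changes in Table 4C (R, Y, K, M, S, W).
IUB_TWO_BASE_CODES = {
    "R": ("A", "G"),
    "Y": ("C", "T"),
    "K": ("G", "T"),
    "M": ("A", "C"),
    "S": ("C", "G"),
    "W": ("A", "T"),
}


def expand_snp(five_prime, code, three_prime):
    """Return both allele sequences for flank + IUB code + flank.

    Raises KeyError for codes outside the two-base set, for example the
    bracketed deletion entries of Table 4C, which need separate handling.
    """
    alleles = IUB_TWO_BASE_CODES[code.upper()]
    return [five_prime + base + three_prime for base in alleles]


if __name__ == "__main__":
    # Illustrative flanks only; real entries would use the Table 4C flanks.
    for seq in expand_snp("ACCTC", "R", "ATGCT"):
        print(seq)
```

Deletion and insertion entries, which Table 4C shows in parentheses, are deliberately left out of this sketch.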

TABLE 5
Protein | Peptide Fragment for Ab Production
ANGE1 | AQASPPRPERVLGAC
ANGE1 | CEDQDPLNPDRSFDV
ANGE1 | KRKRGRKKPLSGNHC
ANGE1 | AQASPPRPERVLGAC
ANGE1 | CEDQDPLNPDRSFDV
ANGE1 | KRKRGRKKPLSGNHC
CLLD8 | DITKYREETPPRSRC
CLLD8 | QKEQENKSNAFPSTSC
CLLD8 | CPPKFSNNPKELTME
CLLD8 | CNEIDSRKLPQFKYR
CLLD7 | GDNQSTLVPKKLEGLC
CLLD7 | GSGSTANQPTPRKVT

TABLE 6 Putative functional promoter SNPs tested by EMSAs. binding SNP sequence SNP name Oligo name oligo sequence with binding areas type (+for −rev) T F Notes hcv98738 NR1WT-F CTCTGCCTCCCGGGTTCAAGC c/t CAGAG − GR 96 NR1WT-R GCTTGAACCCGGGAGGCAGAG a/g TGRMCC − LF- (REN34) A1 NR1MUT-F CTCTGCCTCCTGGGTTCAAGC GAGGC − T- Ag NR1MUT-R GCTTGAACCCAGGAGGCAGAG clld7 NR2WT-F AACGGGGCACTCCCGGCCCGG a/t GGGCA + LF- x1a295t NR2WT-R CCGGGCCGGGAGTGCCCCGTT t/a SCGSSSC +− GCF NR2MUT-F AACGGGGCACACCCGGCCCGG GGGGC+ T- Ag NR2MUT-R CCGGGCCGGGTGTGCCCCGTT AACGGGGCACTCCCGGCCCGG GGGRNNY NFK YCC B CCGGGCCGGGAGTGCCCCGTT function a1 SNP AACGGGGCACACCCGGCCCGG CCGGGCCGGGTGTGCCCCGTT clld7 NR3WT-F CACGGGCTCCTGCTCTGCTCC c/t CAGAG − GR prom1 NR3WT-R GGAGCAGAGCAGGAGCCCGTG a/g a351g NR3MUT-F CACGGGCTCCCGCTCTGCTCC NR3MUT-R GGAGCAGAGCGGGAGCCCGTG clld8x1 NR4WT-F TCCAACCTCCATTTCAGCCTG a/g TCCA + NF1 function a384g NR4WT-R CAGGCTGAAATGGAGGTTGGA c/t a1 SNP NR4MUT-F TCCAACCTCCGTTTCAGCCTG NR4MUT-R CAGGCTGAAACGGAGGTTGGA 
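The forward (-F) and reverse (-R) oligonucleotides of Table 6 appear to be reverse complements of one another, and each mutant oligonucleotide appears to differ from its wild-type counterpart at the SNP base only. The following minimal sketch checks both properties for the NR1 oligonucleotides. The sequences are copied from Table 6; the helper names `reverse_complement` and `mismatches` are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def mismatches(a, b):
    """Return 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


# NR1 oligonucleotides as listed in Table 6.
NR1WT_F = "CTCTGCCTCCCGGGTTCAAGC"
NR1WT_R = "GCTTGAACCCGGGAGGCAGAG"
NR1MUT_F = "CTCTGCCTCCTGGGTTCAAGC"

if __name__ == "__main__":
    print(reverse_complement(NR1WT_F) == NR1WT_R)  # strand pairing check
    print(mismatches(NR1WT_F, NR1MUT_F))           # SNP position(s)
```

The same two checks could be applied to the NR2, NR3 and NR4 sets to catch transcription errors in the oligonucleotide sequences.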

1-86. (canceled)
 87. A method for identifying an agent for treating an IgE mediated disease which modulates the activity of a CLLD8-ANGE hybrid polypeptide or a splice variant thereof comprising: providing a CLLD8-ANGE hybrid polypeptide or splice variant thereof; providing a substrate; providing an agent to be tested; measuring whether the agent to be tested modulates the activity of the polypeptide by measuring processing of the substrate.
 88. A method for identifying an agent for treating an IgE mediated disease which modulates the activity of an ANGE or CLLD8-ANGE hybrid polypeptide or the activity of any splice variant thereof comprising: providing an ANGE or CLLD8-ANGE hybrid polypeptide or splice variant thereof; providing an agent to be tested; providing a cell; measuring a change in differentiation or proliferation of the cell.
 89. A method according to claim 88, wherein the change is a change in phenotype.
 90. A method according to claim 88, where the cell is expressing a polypeptide encoded by an isolated or recombinant nucleic acid molecule comprising an ANGE or a CLLD8-ANGE hybrid mRNA sequence or splice variant thereof.
 91. A method according to claim 88 where the change in cellular differentiation involves a change in expression of a cell signalling factor.
 92. A method according to claim 88, wherein the cell is a B-lymphocyte.
 93. A method according to claim 91, wherein the cell signalling factor is an immunomodulator or a peptide regulatory factor.
 94. A method according to claim 88, wherein the cell is cultured following removal from a patient or experimental animal.
 95. A method for identifying an agent for treating an IgE mediated disease which modulates the activity of ANGE or CLLD8-ANGE hybrid or any splice variant thereof comprising: providing a transgenic animal comprising a vector comprising an isolated or recombinant nucleic acid molecule comprising an ANGE or a CLLD8-ANGE hybrid mRNA sequence or splice variant thereof which results in disease; providing an agent to be tested; contacting the transgenic animal with the agent to be tested; detecting a change in the transgenic animal's phenotype.
 96. A method for detecting a side effect associated with the use of an anti IgE-mediated disease agent which modulates the activity of an ANGE or ANGE-CLLD8 hybrid polypeptide or any splice variant thereof comprising: providing a cell which does not substantially express an ANGE or CLLD8-ANGE hybrid polypeptide or a splice variant thereof; providing an agent to be tested; contacting the agent to be tested with the cell; and measuring any side effect produced by the agent on the cell.
 97. A method according to claim 96 where the side effect involves a change in cell differentiation.
 98. A method according to claim 96 where the side effect involves a change in cell proliferation.
 99. A method according to claim 96 where the cell is part of a transgenic animal.
 100. A method according to claim 96 where the side effect is a measure of a change of phenotype.
 101. A method for identifying an agent for treating an IgE mediated disease which modulates the activity of ANGE or CLLD8-ANGE hybrid or any splice variant thereof comprising: providing an isolated nucleic acid sequence encoding ANGE or CLLD8-ANGE hybrid or a splice variant of either; providing an agent to be tested; measuring whether the agent to be tested modulates the activity of the isolated nucleic acid by measuring the interaction of the agent with the sample of nucleic acid.
 102. A method according to claim 101 where the screen is an in vitro transcription assay measuring transcription of ANGE or CLLD8-ANGE hybrid or of any splice variant thereof.
 103. A method for identifying an agent for treating an IgE mediated disease which modulates the activity of ANGE or CLLD8-ANGE hybrid or any splice variant thereof comprising: providing a cell producing nucleic acid encoding an ANGE or CLLD8-ANGE hybrid polypeptide or a splice variant of either; providing an agent to be tested; measuring whether the agent to be tested modulates the activity of said nucleic acid by measuring the interaction of the agent with said nucleic acid produced by said cell.
 104. A method for identifying an agent for treating an IgE mediated disease wherein said method comprises: providing an ANGE or CLLD8-ANGE polypeptide or a splice variant thereof; providing an agent to be tested; and measuring whether the agent to be tested modulates the activity of said ANGE or CLLD8-ANGE polypeptide or a splice variant thereof.
 105. A method according to claim 104 wherein the activity that is measured is activity of a methyl-CpG-binding domain or a SET domain of ANGE-CLLD8 polypeptide or a splice variant thereof.
 106. A method according to claim 104 wherein the activity that is measured is histone methyl transferase activity of a SET domain of ANGE-CLLD8 polypeptide or a splice variant thereof.
 107. A method according to claim 104 wherein the activity that is measured is activity of a PHD domain of an ANGE or ANGE-CLLD8 polypeptide or a splice variant thereof.
 108. A method for identifying an agent for treating an IgE mediated disease wherein said method comprises: providing a cell producing an ANGE or CLLD8-ANGE polypeptide or a splice variant of either; providing an agent to be tested; and measuring whether the agent to be tested modulates the activity of said ANGE or CLLD8-ANGE polypeptide or a splice variant thereof produced by said cell.
 109. A method according to claim 108 wherein the activity that is measured is activity of a methyl-CpG-binding domain or a SET domain of ANGE-CLLD8 polypeptide or a splice variant thereof.
 110. A method according to claim 108 wherein the activity that is measured is histone methyl transferase activity of a SET domain of ANGE-CLLD8 polypeptide or a splice variant thereof.
 111. A method according to claim 108 wherein the activity that is measured is activity of a PHD domain of an ANGE or ANGE-CLLD8 polypeptide or a splice variant thereof.
 112. A method according to claim 87 wherein said polypeptide comprises the sequence set forth in SEQ ID NO:27.
 113. A method according to claim 87 wherein said polypeptide comprises the sequence set forth in SEQ ID NO:27.
 114. A method according to claim 108 wherein said polypeptide comprises the sequence set forth in SEQ ID NO:27.
 115. A method according to claim 87 wherein said IgE mediated disease is asthma, atopy, hayfever, eczema, atopic dermatitis or allergic rhinitis.
 116. A method according to claim 104 wherein said IgE mediated disease is asthma, atopy, hayfever, eczema, atopic dermatitis or allergic rhinitis.
 117. A method according to claim 108 wherein said IgE mediated disease is asthma, atopy, hayfever, eczema, atopic dermatitis or allergic rhinitis. 